EUDAT Service Suite Overview - EUDAT Summer School (Shaun de Witt, CCFE)

www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
The EUDAT Service Suite
Shaun de Witt

Learning Objective
Why Data Management is Important
Get an overview of the EUDAT services and how they link together

The Importance of Data Management
Research Infrastructure trends:
 Internationalisation
 Diversification
[Timeline figure: Middle Ages, 19th century, 20th century, 21st century]
Large Scale Projects:
 SKA (300PB/yr)
LHC (600PB/yr – run 4)
Human Brain Project (21PB)
IPCC/CMIP (10PB ->150PB)
ITER (600PB/yr)
XFEL (10PB/yr)
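
To put the yearly volumes above in perspective, they can be converted to an average sustained data rate. The sketch below is illustrative only: it assumes decimal petabytes (10^15 bytes), a 365-day year and perfectly even data taking, none of which is stated on the slide, and the function name is made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year
BYTES_PER_PB = 10**15               # assumption: decimal petabyte
BYTES_PER_GB = 10**9


def sustained_rate_gb_per_s(pb_per_year: float) -> float:
    """Average rate in GB/s needed to move pb_per_year evenly over one year."""
    return pb_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / BYTES_PER_GB


if __name__ == "__main__":
    # Yearly volumes taken from the list above.
    projects = [("SKA", 300), ("LHC run 4", 600), ("ITER", 600), ("XFEL", 10)]
    for name, volume in projects:
        print(f"{name}: {sustained_rate_gb_per_s(volume):.1f} GB/s")
```

Under these assumptions, 600 PB/yr corresponds to roughly 19 GB/s around the clock.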

Data Scientist Core Skills (Research Oriented)
From EDISON Data Science Framework

Data Management (40,000-6,000BC)
Good
Not too bad for long term preservation
Cheap materials
Bad
No key to interpretation
Slow write rates
Requires good stable conditions for
long term preservation
Easily corrupted/overwritten
Main Use
Recording legendary parties after
hunting raids
Abstract hand art to confuse future
generations
(© Niede Guidon/Bradshaw Foundation)

Data Management (9000-1,000BC)
Good
Excellent long term preservation with
the right materials
Cheapish materials
Difficult to corrupt/overwrite
Bad
No key to interpretation
VERY slow write rates
Difficult to re-use in a global
environment
Main Uses
Laws, accounting and taxes

Data Management (2500BC-present)
Good
Easy to manufacture materials
Much finer detail than previous efforts
(better bit density)
OK long term preservation
Improved data movement
Bad
Needs right conditions
Easily lost/fragile
Cataloguing and indexing was initially a
problem
Main Uses
Laws, accounting and taxes
Sketching, writing, love letters, paper
airplanes…

Data Management (1AD-present)
Good
Easy to manufacture materials
Simpler to organise
OK long term preservation
Highly portable
Bad
Needs right conditions
Easily lost/fragile
Optical Character Recognition not
yet invented
Loss of knowledge over time
Main Uses
Meeting minutes, meeting actions,
doodling, remembering the date

Data Management (1890s – 1990s)
Good
Excellent longevity
Very high data density (35mm frame is
equivalent to about 40 Mpixels)
Highly portable
Reproducible
Bad
Sometimes takes several attempts to
produce good data
Fragile
Subject to noise
Not very metadata rich (difficult to
index)
Main Uses
Pretty astronomical pictures, Family
albums, Selfies…

Data Management (1920s-1970s)
Good
Digital – sort of
Reproducible and transportable???
Bad
Don’t drop this!
Low data density
Not designed to any standard
Main Uses
Maths
Figure labels: 5MB; 150kW; 100 word memory; 0.00005 MIPS
© 2005 Paul W Shaffer, University of Pennsylvania

Data Management (1950s – 1990s)
Good
Really programmable
Proper digital.
Reproducible and Transportable
Bad
Tapes and disks were
low data density
Not designed to any standard
Main Uses
Maths, bad 1970s movie backdrops,
calculating taxes
Figure labels: 140MB; 170MB; 1.25MB/s; ~1MW; ~1MB memory; ~16 MIPS

Data Management (1980s-present)
Good
Cheap
Powerful
Clusterable, configurable
Bad
MS-DOS, Windows 3.1, OS-2
Main Uses
Chuckie Egg, Solitaire, Manic Miner
Figure labels: < 1MW; ~256kB memory; ~2 MIPS; 1.44MB; 5-14 GB; 500 kB/s; 1 TB

Data Management (1960s - present)
Good
Powerful
Scalable
Bad
MS-DOS, Windows 3.1, OS-2
Main Uses
Chuckie Egg, Solitaire, Manic Miner
Figure labels: ~15MW; ~1.3PB memory; ~93TFLOPS; ~900TB; >500PB; ~50MB/s

Data Management – A Personal Perspective
Data Management is NOT just about data preservation
What data do I need to preserve
What data do I want to make visible
What legal frameworks do I need to adhere to
Who can access my data
Where do I need to move my data to
And when do I need to move it

EUDAT Service Suite
During this course you will learn:
How services link to data lifecycle
How services support the FAIR principles
How to use the web interface (where available)
How to use the APIs available to access services programmatically.

Community-Driven Solutions
EUDAT services are
designed, built and
implemented based
on user community
requirements.

Data Domains
EUDAT Data Domain modelled on the ANDS Data Curation Continuum
(ANDS: Australian National Data Service, www.ands.org.au)

More than Just Services
Help desk
Monitoring
Collaboration Tools (restricted access)
Service Catalogue and registry
Input to requirements
Data project co-ordination

Secure Access to Services
b2access.eudat.eu
B2ACCESS
B2ACCESS is an easy-to-use and secure authentication and
authorization platform which can be integrated with any
service and supports different methods of authentication.

An easy-to-use and secure
authentication and authorization
platform that can be integrated
with any service
The user may log in by using
different methods of
authentication:
Home organisation identity
provider
Social ID
EUDAT ID
Allows group-, community- and
service managers to specify
authorisation decisions
Features:
Easy integration in any service
Reliable and light-weight
Powerful management interface

b2drop.eudat.eu
Sync and Share Research Data
B2DROP
EUDAT’s Personal Cloud Storage Service
B2DROP is a secure and trusted data exchange service for
researchers and scientists to keep their research data synchronized
and up-to-date and to exchange it with others.

An ideal solution for researchers and scientists to:
store and exchange data with
colleagues and team members,
including research data not
finalized for publishing
share data with fine-grained
access controls
synchronize multiple versions of
data across different devices
Features:
20 GB storage per user
Living objects, so no PIDs
Versioning and offline use
Desktop synchronisation

Store and Publish Research Data
b2share.eudat.eu
B2SHARE
B2SHARE is a user-friendly, reliable and trustworthy way for
researchers, scientific communities and scientists to store and share
small-scale research data from diverse contexts.

A winning solution for researchers, scientists and communities
to:
store data safely at a trusted
and certified data centre
preserve data to guarantee
long-term persistence
control access and share data
with colleagues and the world
Features:
Metadata management
Permanent PIDs
Open Access support

Replicate Research Data Safely
eudat.eu/b2safe
B2SAFE
B2SAFE is a robust, safe and highly available service which allows
community and departmental repositories to implement data
management policies on research data across multiple administrative
domains in a trustworthy manner.

The ideal solution for communities with no archival facility
to:
replicate research data into secure
data stores
archive and preserve research data
in the long term
bring data close to powerful
compute resources
co-locate data with different
communities
benefit from economies of scale
Features:
Large-scale storage
Robust and highly available
Permanent PIDs
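
Safe replication implies being able to check that a replica is identical to the original. The sketch below shows one generic way to do that by comparing SHA-256 checksums of two local files. It is not B2SAFE's implementation, and the function names are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replica_matches(original: Path, replica: Path) -> bool:
    """True if both files have the same SHA-256 checksum."""
    return sha256_of(original) == sha256_of(replica)
```

Reading in chunks keeps memory use constant, which matters for the large data sets this kind of service targets.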

Get Data to Computation
eudat.eu/b2stage
B2STAGE
B2STAGE is a reliable, efficient, light-weight and easy-to-use service
to transfer research data sets between EUDAT storage resources and
high-performance computing (HPC) workspaces

Enabling communities to:
move large amounts of data
between data stores and
high-performance compute resources
re-ingest computational results
back into EUDAT
deposit large data sets onto EUDAT
resources for long-term preservation
Features:
High-speed transfer
Reliable and light-weight
Manages permanent PIDs

Find Research Data
b2find.eudat.eu
B2FIND
B2FIND is a simple, user-friendly metadata catalogue of
research data collections stored in EUDAT data centres and
other repositories.

A metadata catalogue service to:
seek data objects and collections
using powerful metadata searches
catalogue community data by
means of selected metadata
browse through multi-disciplinary
data collections filtered by content,
provenance and temporal keywords
Features:
Simple to use
Standards-based
Comprehensive catalogue

Data Discovery and Identification
b2handle.eudat.eu
B2HANDLE
B2HANDLE provides an abstraction layer between a globally
unique persistent identifier and the physical location of a data
object, allowing researchers to cite and refer to data reliably
in the long term.

Provides abstraction layer between
a globally unique persistent
identifier and physical location
of data objects
Follows policies to register data
and make it referable and citable
in the long term
Features:
Reliability through mutual PID mirroring
Machine readable via HTTP RESTful API
Simple integration with any service
Technology agnostic
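
The abstraction layer described above can be illustrated with a small sketch that maps a persistent identifier to the current location of a data object. It assumes the public Handle.net proxy REST endpoint and the JSON record shape it returns; a B2HANDLE deployment may use a different host. The sample handle, the record contents and the function names are hypothetical, and the sketch parses a canned record rather than making a network call.

```python
import json
from typing import Optional
from urllib.parse import quote

# Assumption: the public Handle.net proxy; a B2HANDLE host may differ.
HANDLE_PROXY = "https://hdl.handle.net/api/handles/"


def handle_api_url(handle: str) -> str:
    """Build the REST lookup URL for a handle of the form 'prefix/suffix'."""
    return HANDLE_PROXY + quote(handle, safe="/")


def location_from_record(record: dict) -> Optional[str]:
    """Return the value of the first URL-typed entry in a handle record."""
    for entry in record.get("values", []):
        if entry.get("type") == "URL":
            return entry.get("data", {}).get("value")
    return None


# Hypothetical record, shaped like the JSON a handle lookup returns.
SAMPLE_RECORD = json.loads("""
{
  "responseCode": 1,
  "handle": "11111/example-0001",
  "values": [
    {"index": 1, "type": "URL",
     "data": {"format": "string",
              "value": "https://repo.example.org/objects/0001"}},
    {"index": 2, "type": "CHECKSUM",
     "data": {"format": "string", "value": "abc123"}}
  ]
}
""")

if __name__ == "__main__":
    print(handle_api_url(SAMPLE_RECORD["handle"]))
    print(location_from_record(SAMPLE_RECORD))
```

Because citations carry only the identifier, the URL entry in the record can be updated when the object moves without breaking existing references.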

For more info:
b2drop.eudat.eu
eudat.eu/services/userdoc/b2drop
b2share.eudat.eu
eudat.eu/services/userdoc/b2share
eudat.eu/services/userdoc/b2safe
b2find.eudat.eu
eudat.eu/services/userdoc/b2find
b2access.eudat.eu
eudat.eu/services/userdoc/b2access-usage
eudat.eu/services/userdoc/b2handle
eudat.eu/b2stage
eudat.eu/services/userdoc/b2stage
