Buyers Group
Deployment Scenarios
Evangelos Motesnitsalis
Technical Coordinator
OMC Kick-off Event
8 April 2019
10/04/2019 http://www.archiver-project.eu 2
Contents
OAIS Reference Model
FAIR Principles
Deployment Scenarios
Buyers Group Goals
High Energy Physics Goals
Life Science Goals
Astronomy Goals
Photon Science Goals
Data Volumes
Data Ingest Rates
Retention Period
Summary
OAIS and FAIR
OAIS Reference Model
Relevant Standards
Preservation: ISO 14721/16393, 26324 and related standards
Storage/Basic Archiving/Secure backup: ISO 27000, 27040, 19086
FAIR Principles
Findable
• Global, unique identifiers
• Rich metadata, indexes, search capabilities
Accessible
• Retrievable with free protocols
• Accessible metadata even after deletion
Interoperable
• Qualified reference to other data
• Formal, shared and broadly applicable knowledge representation standards
Re-Usable
• Accurate and relevant description
• Data usage license and detailed provenance
https://www.go-fair.org/
Deployment Scenarios
Initial List of Deployment Scenarios
Field                          Scenario Name
High Energy Physics [4]        BaBar Archive Stage 1
                               DPHEP EOSC Science Demonstrator
                               CERN Open Data Cloud Archive Services / CODCAS
                               CERN E-Ternity
Life Sciences [2]              EMBL/FIRE
                               EMBL Cloud-caching for Data Analysis
Astronomy and Cosmology [3]    Second copy of data for Disaster Recovery / DISASTER
                               Analysis dataset server for gamma-ray astronomy / GAMMADAT
                               Open Data Publisher / OPENPUB
Photon Science [3]             Photon-Science/Scientist
                               Photon-Science/Working Group
                               Photon Science/Collaboration
High Energy Physics Scenario Goals
In 2020 the BaBar Experiment infrastructure at SLAC will be decommissioned. As a result, BaBar
data (2 PB) can no longer be stored at the host laboratory and alternative solutions need to be
found. Currently a copy of the data is being held by CERN IT. We want to ensure that a complete
copy of BaBar data will be retained for possible comparisons with data from other experiments
and for sharing through the CERN Open Data Portal.
The CERN Open Data Portal disseminates close to 2 PB of open particle physics data released by
LHC experiments and is used for both education and research purposes. We want to
establish a “passive” data archive for disaster-recovery purposes as well as an additional “active”
archive, exposed via protocols such as S3 and XRootD, which will allow users to run open data
analysis examples.
We want to archive the ~1 PB of CERN Digital Memory, containing analog documents produced by
the institution in the 20th century as well as the digital production of the 21st century, including
new types such as web sites, social media, emails, etc.
Life Sciences Scenario Goals
EMBL-EBI provides data archiving services to the global molecular biology community. These
data archives are currently based on an internal service (FIRE: FIle REplication) that stores the
files in two different systems: a distributed object store and tape.
FIRE currently holds 20 PB of data and is growing at 40% per year. We want to ensure that:
• FIRE can achieve cost-effective scaling via cloud-based storage solutions
• Data can be distributed effectively on cloud infrastructure, covering the increasing need for cloud-hosted analysis
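To give a feel for what 40% annual growth implies for capacity planning, here is a minimal sketch. The 20 PB starting volume and the 40% rate are the figures quoted for FIRE; the assumption that the rate stays constant, the five-year horizon and the function name `projected_volume_pb` are illustrative only.

```python
def projected_volume_pb(initial_pb: float, annual_growth: float, years: float) -> float:
    """Volume after `years`, assuming a constant compound annual growth rate."""
    return initial_pb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # 20 PB and 40% per year come from the slide; the constant rate and the
    # five-year horizon are assumptions made for illustration.
    for year in range(6):
        print(f"year {year}: {projected_volume_pb(20.0, 0.40, year):6.1f} PB")
```

Under these assumptions the archive would reach about 28 PB after one year and roughly 108 PB after five.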
As research communities access more and more internal data from cloud services for their
data analysis, it makes sense to progressively cache/store data in the cloud, with the
on-premises data being replicated and discarded as required.
Which data should be cached/stored, how much and for how long will be a tradeoff between
the cost of cloud storage and the cost of having the network capacity/latency to download
the data multiple times.
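The tradeoff can be sketched as a simple break-even calculation. This is a deliberately simplified model, not the Buyers Group's costing method: it treats repeated downloads as a per-TB transfer price and ignores latency. All prices and the function names are hypothetical.

```python
def caching_cost(size_tb: float, months: float, storage_price_per_tb_month: float) -> float:
    """Cost of keeping a dataset cached in cloud storage for `months`."""
    return size_tb * months * storage_price_per_tb_month


def redownload_cost(size_tb: float, downloads: int, transfer_price_per_tb: float) -> float:
    """Cost of transferring the dataset again for every analysis run."""
    return size_tb * downloads * transfer_price_per_tb


def break_even_downloads(months: float, storage_price_per_tb_month: float,
                         transfer_price_per_tb: float) -> float:
    """Number of downloads above which caching becomes the cheaper option."""
    return months * storage_price_per_tb_month / transfer_price_per_tb


if __name__ == "__main__":
    # Hypothetical prices: 2.0 per TB-month of storage, 1.0 per TB transferred.
    print(caching_cost(100, 12, 2.0))        # 100 TB cached for one year
    print(redownload_cost(100, 30, 1.0))     # 100 TB downloaded 30 times
    print(break_even_downloads(12, 2.0, 1.0))
```

In this model the dataset size cancels out: caching pays off once the expected number of downloads in the period exceeds the break-even value.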
Astronomy Scenario Goals
The MAGIC Cherenkov gamma-ray telescopes and the PAUcam camera for
the William Herschel Telescope are located at the Observatorio del Roque de
los Muchachos, in the Canary Islands, Spain. The first Large-Sized Telescope of
the next-generation Cherenkov Telescope Array (CTA) is also there.
They produce about 0.3 PB of raw data per year, which is automatically sent
to PIC in Barcelona.
Data are rarely recalled (less than once per year), but whenever required,
they must be accessible within 3 weeks.
Our goals are:
• to ensure that a second copy of data is retained for disaster recovery purposes.
• to replace the current data distribution service at PIC with a commercial service with better
functionality, easier maintenance and lower cost.
• to acquire a method to publish certain datasets as Open Data according to Digital Library
standards and link them to publications.
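The figures quoted for this scenario translate into sustained transfer rates as in the sketch below. It assumes decimal units (1 PB = 1,000,000 GB), a 365-day year, and that a recall covers one full year of raw data; the function name `average_rate_gb_s` is made up for this example.

```python
SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000  # decimal units, an assumption


def average_rate_gb_s(volume_pb: float, days: float) -> float:
    """Sustained rate in GB/s needed to move `volume_pb` within `days`."""
    return volume_pb * GB_PER_PB / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # 0.3 PB of raw data produced per year, spread evenly over the year.
    print(f"average ingest: {average_rate_gb_s(0.3, 365):.4f} GB/s")
    # Recalling one year of data (assumed scope) within the 3-week limit.
    print(f"recall in 3 weeks: {average_rate_gb_s(0.3, 21):.4f} GB/s")
```

Real transfers are bursty, so peak rates can be much higher than these averages.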
Photon Science Scenario Goals
• Individual scientists at DESY need a service to create archives for their experiment data as
well as their publications, with specific capabilities such as continuous data ingestion via
browser or third-party copies.
• Working groups want to be able to create/manage/delete archives based on accepted data
policies supporting a wide range of options for cloud and on-prem storage, while being
able to use existing user credentials, authentication techniques and identification
mechanisms.
• Long-lived collaborations present a growing need to plan and execute archiving operations
in a fully automated, policy-based, certified and documented way via API, with close to
100% automated procedures.
Data Characteristics
Data Volumes
Type                          Deployment Scenario Name                                      Data Volumes
Low Range Scenarios [3]       Analysis dataset server for gamma-ray astronomy / GAMMADAT    0.01 PB
                              Open Data Publisher / OPENPUB                                 0.01 PB
                              DPHEP EOSC Science Demonstrator                               0.1+ PB
Medium Range Scenarios [3]    Photon-Science/Scientist                                      0.5 PB
                              EMBL Cloud-caching for Data Analysis                          0.5 PB
                              CERN E-Ternity                                                0.7 PB
High Range Scenarios [6]      Second copy of data for Disaster Recovery / DISASTER          0.3 PB / year
                              Photon-Science/Working Group                                  1 PB
                              BaBar Archive Stage 1                                         2 PB
                              CERN Open Data Cloud Archive Services / CODCAS                2+ PB
                              EMBL on Fire                                                  20+ PB
                              Photon Science/Collaboration                                  100 PB
Retention Period
Type                           Deployment Scenario Name                                      Retention Period
Short Retention Period [2]     Second copy of data for Disaster Recovery / DISASTER          <5 years
                               EMBL Cloud-caching for Data Analysis                          <5 years
Medium Retention Period [8]    Photon Science/Collaboration                                  10+ years
                               Photon-Science/Working Group                                  10+ years
                               Photon-Science/Scientist                                      10+ years
                               BaBar Archive Stage 1                                         10 years
                               DPHEP EOSC Science Demonstrator                               10 years
                               Analysis dataset server for gamma-ray astronomy / GAMMADAT    10+ years
                               CERN Open Data Cloud Archive Services / CODCAS                5-10 years
                               CERN E-Ternity                                                10+ years
Long Retention Period [2]      Open Data Publisher / OPENPUB                                 25+ years
                               EMBL on Fire                                                  25+ years
Data Ingest Rates
Type                   Deployment Scenario Name                                      Data Ingest Rates
Low Rates [1]          CERN E-Ternity                                                0.01 GB/s
Medium Rates [3]       CERN Open Data Cloud Archive Services / CODCAS                1 GB/s
                       Photon-Science/Scientist                                      1-2 GB/s
                       EMBL on Fire                                                  1-2 GB/s
High Rates [7]         Second copy of data for Disaster Recovery / DISASTER          1-10 GB/s
                       Photon-Science/Working Group                                  1-10 GB/s
                       Analysis dataset server for gamma-ray astronomy / GAMMADAT    1-10 GB/s
                       BaBar Archive Stage 1                                         1-10 GB/s
                       EMBL Cloud-caching for Data Analysis                          1-10 GB/s
                       DPHEP EOSC Science Demonstrator                               1-10 GB/s
                       Open Data Publisher / OPENPUB                                 1-10 GB/s
Very High Rates [1]    Photon Science/Collaboration                                  8-20 GB/s
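Combining the data volumes with the ingest rates gives a rough idea of how long an initial bulk ingest would take. The sketch below assumes decimal units (1 PB = 1,000,000 GB) and a rate sustained around the clock, which real transfers rarely achieve; `days_to_transfer` is a name chosen for this example.

```python
SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000  # decimal units, an assumption


def days_to_transfer(volume_pb: float, rate_gb_s: float) -> float:
    """Days needed to move `volume_pb` at a constant rate of `rate_gb_s`."""
    return volume_pb * GB_PER_PB / rate_gb_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # Volume and rate ranges are taken from the tables in this deck.
    scenarios = [
        ("BaBar Archive Stage 1", 2.0, (1.0, 10.0)),
        ("Photon Science/Collaboration", 100.0, (8.0, 20.0)),
    ]
    for name, volume_pb, (low, high) in scenarios:
        slowest = days_to_transfer(volume_pb, low)
        fastest = days_to_transfer(volume_pb, high)
        print(f"{name}: {fastest:.1f} to {slowest:.1f} days")
```

Under these assumptions, 2 PB at 1 GB/s takes a little over 23 days, and 100 PB at 20 GB/s takes about 58 days.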
Overview
Summary and Next Steps
• The objective of ARCHIVER is to perform R&D to demonstrate the functionality and
performance of services for the long-term preservation and archiving of scientific data in the
PB range under F.A.I.R. principles, while ensuring that research groups retain
stewardship of their data sets.
• The ARCHIVER Pre-Commercial Procurement will run an open tender, and the resulting services
will be integrated into the EOSC catalogue and made broadly accessible to various
organizations.
• We welcome your feedback on the draft of the “Functional Specifications” document, which
will be released shortly after this event.
• The Buyers Group will co-design and co-develop a test plan with you, based on the
outcome of the Design Phase, the Functional Specifications and the Deployment Scenarios.
• The test assessment will be a deciding factor in qualifying solutions for the subsequent phases.
• The tests will focus on basic functionality during the prototype phase and on
performance, efficiency and scalability during the pilot phase.

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

3 archiver omc deployment_scenarios

  • 1. Buyers Group Deployment Scenarios. Evangelos Motesnitsalis, Technical Coordinator. OMC Kick-off Event, 8 April 2019
  • 2. Contents (10/04/2019, http://www.archiver-project.eu)
      • OAIS Reference Model
      • FAIR Principles
      • Deployment Scenarios
      • Buyers Group Goals
      • High Energy Physics Goals
      • Life Science Goals
      • Astronomy Goals
      • Photon Science Goals
      • Data Volumes
      • Data Ingest Rates
      • Retention Period
      • Summary
  • 4. OAIS Reference Model: Relevant Standards
      • Preservation: ISO 14721/16393, 26324 and related standards
      • Storage/Basic Archiving/Secure backup: ISO 27000, 27040, 19086
  • 5. FAIR Principles (https://www.go-fair.org/)
      • Findable
          • Global, unique identifiers
          • Rich metadata, indexes, search capabilities
      • Accessible
          • Retrievable with free protocols
          • Accessible metadata even after deletion
      • Interoperable
          • Qualified reference to other data
          • Formal, shared and broadly applicable knowledge representation standards
      • Re-Usable
          • Accurate and relevant description
          • Data usage license and detailed provenance
  • 7. Initial List of Deployment Scenarios (Field: Scenario Name)
      • High Energy Physics [4]
          • BaBar Archive Stage 1
          • DPHEP EOSC Science Demonstrator
          • CERN Open Data Cloud Archive Services / CODCAS
          • CERN E-Ternity
      • Life Sciences [2]
          • EMBL/FIRE
          • EMBL Cloud-caching for Data Analysis
      • Astronomy and Cosmology [3]
          • Second copy of data for Disaster Recovery / DISASTER
          • Analysis dataset server for gamma-ray astronomy / GAMMADAT
          • Open Data Publisher / OPENPUB
      • Photon Science [3]
          • Photon-Science/Scientist
          • Photon-Science/Working Group
          • Photon Science/Collaboration
  • 8. High Energy Physics Scenario Goals
      • In 2020 the BaBar Experiment infrastructure at SLAC will be decommissioned. As a result, BaBar data [2 PB] can no longer be stored at the host laboratory and alternative solutions need to be found. Currently a copy of the data is being held by CERN IT. We want to ensure that a complete copy of BaBar data will be retained for possible comparisons with data from other experiments and for sharing through the CERN Open Data Portal.
      • The CERN Open Data Portal disseminates close to 2 PB of open particle physics data released by LHC experiments and is used for both education and research purposes. We want to establish a “passive” data archive for disaster-recovery purposes as well as an additional “active” archive, exposed via protocols such as S3 and XRootD, which will allow users to run open data analysis examples.
      • We want to archive the ~1 PB of CERN Digital Memory, containing analog documents produced by the institution in the 20th century as well as the digital production of the 21st century, including new types such as web sites, social media, emails, etc.
  • 9. Life Sciences Scenario Goals
      • EMBL-EBI provides data archiving services to the global molecular biology community. These data archives are currently based on an internal service (FIRE: FIle REplication) that stores the files in two different systems: a distributed object store and tape. FIRE currently holds 20 PB of data and is growing at 40% per year.
      • We want to ensure that:
          • FIRE can achieve cost-effective scaling via cloud-based storage solutions.
          • Data can effectively be distributed on cloud infrastructure, covering the increasing needs for cloud-hosted analysis.
      • As research communities access more and more internal data from cloud services for their data analysis, it makes sense to progressively cache/store data in the cloud, with the on-premises data being replicated and discarded as required. Which data should be cached/stored, how much and for how long will be a tradeoff between the cost of cloud storage and the cost of having the network capacity/latency to download the data multiple times.
  • 10. Astronomy Scenario Goals
      • The MAGIC Cherenkov gamma-ray telescopes and the PAUcam camera for the William Herschel Telescope are located at the Observatorio del Roque de los Muchachos, in the Canary Islands, Spain. The first Large Scale Telescope of the next-generation Cherenkov Telescope Array (CTA) is also there. They produce about 0.3 PB of raw data per year, which are automatically sent to PIC in Barcelona. Data are rarely recalled, less than once per year, but whenever required they must be accessible within 3 weeks.
      • Our goals are:
          • to ensure that a second copy of the data is retained for disaster recovery purposes;
          • to replace the current data distribution service at PIC with a commercial service offering better functionality, easier maintenance and lower cost;
          • to acquire a method to publish certain datasets as Open Data according to Digital Library standards and link them to publications.
  • 11. Photon Science Scenario Goals
      • Individual scientists at DESY need a service to create archives for their experiment data as well as their publications, with specific capabilities such as continuous data ingestion via browser or third-party copies.
      • Working groups want to be able to create/manage/delete archives based on accepted data policies, supporting a wide range of options for cloud and on-prem storage, while being able to utilize existing user credentials, authentication techniques and identification mechanisms.
      • Long-lived collaborations present a growing need to plan and execute archiving operations in a fully automated, policy-based, certified and documented way via API, with close to 100% automated procedures.
  • 13. Data Volumes (Type: Deployment Scenario Name, Data Volume)
      • Low Range Scenarios [3]
          • Analysis dataset server for gamma-ray astronomy / GAMMADAT: 0.01 PB
          • Open Data Publisher / OPENPUB: 0.01 PB
          • DPHEP EOSC Science Demonstrator: 0.1+ PB
      • Medium Range Scenarios [3]
          • Photon-Science/Scientist: 0.5 PB
          • EMBL Cloud-caching for Data Analysis: 0.5 PB
          • CERN E-Ternity: 0.7 PB
      • High Range Scenarios [6]
          • Second copy of data for Disaster Recovery / DISASTER: 0.3 PB / year
          • Photon-Science/Working Group: 1 PB
          • BaBar Archive Stage 1: 2 PB
          • CERN Open Data Cloud Archive Services / CODCAS: 2+ PB
          • EMBL on Fire: 20+ PB
          • Photon Science/Collaboration: 100 PB
  • 14. Retention Period (Type: Deployment Scenario Name, Retention Period)
      • Short Retention Period [2]
          • Second copy of data for Disaster Recovery / DISASTER: <5 years
          • EMBL Cloud-caching for Data Analysis: <5 years
      • Medium Retention Period [8]
          • Photon Science/Collaboration: 10+ years
          • Photon-Science/Working Group: 10+ years
          • Photon-Science/Scientist: 10+ years
          • BaBar Archive Stage 1: 10 years
          • DPHEP EOSC Science Demonstrator: 10 years
          • Analysis dataset server for gamma-ray astronomy / GAMMADAT: 10+ years
          • CERN Open Data Cloud Archive Services / CODCAS: 5-10 years
          • CERN E-Ternity: 10+ years
      • Long Retention Period [2]
          • Open Data Publisher / OPENPUB: 25+ years
          • EMBL on Fire: 25+ years
  • 15. Data Ingest Rates (Type: Deployment Scenario Name, Data Ingest Rate)
      • Low Rates [1]
          • CERN E-Ternity: 0.01 GB/s
      • Medium Rates [3]
          • CERN Open Data Cloud Archive Services / CODCAS: 1 GB/s
          • Photon-Science/Scientist: 1-2 GB/s
          • EMBL on Fire: 1-2 GB/s
      • High Rates [7]
          • Second copy of data for Disaster Recovery / DISASTER: 1-10 GB/s
          • Photon-Science/Working Group: 1-10 GB/s
          • Analysis dataset server for gamma-ray astronomy / GAMMADAT: 1-10 GB/s
          • BaBar Archive Stage 1: 1-10 GB/s
          • EMBL Cloud-caching for Data Analysis: 1-10 GB/s
          • DPHEP EOSC Science Demonstrator: 1-10 GB/s
          • Open Data Publisher / OPENPUB: 1-10 GB/s
      • Very High Rates [1]
          • Photon Science/Collaboration: 8-20 GB/s
  • 18. Summary and Next Steps
      • The objective of ARCHIVER is to perform R&D to demonstrate functionality and performance of services for long-term preservation and archiving of scientific data in the PB range under FAIR principles, while ensuring that research groups retain stewardship of their data sets.
      • The ARCHIVER Pre-Commercial Procurement will run an open tender, and the resulting services will be integrated into the EOSC catalogue and made broadly accessible to various organizations.
      • We welcome your feedback on the draft of the “Functional Specifications” document, which will be released shortly after this event.
      • The Buyers Group will co-design and co-develop a test plan with you, based on the outcome of the Design Phase, the Functional Specifications and the Deployment Scenarios.
      • The test assessment will be a deciding factor in qualifying solutions for the subsequent phases.
      • The tests will focus on basic functionality during the prototype phase and on performance, efficiency and scalability during the pilot phase.
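
The volumes, ingest rates and growth figures quoted in the slides lend themselves to quick back-of-the-envelope checks. The sketch below is not part of the presentation. It assumes decimal units (1 PB = 10^6 GB) and a perfectly sustained ingest rate, and the function names `transfer_days` and `projected_volume_pb` are hypothetical helpers introduced here for illustration only.

```python
"""Illustrative sizing arithmetic for the deployment scenarios (not from the slides)."""

GB_PER_PB = 1_000_000      # assumption: decimal units, 1 PB = 10**6 GB
SECONDS_PER_DAY = 86_400


def transfer_days(volume_pb: float, rate_gb_per_s: float) -> float:
    """Days needed to ingest `volume_pb` at a constant rate of `rate_gb_per_s`.

    Assumes the rate is sustained around the clock with no protocol overhead.
    """
    if rate_gb_per_s <= 0:
        raise ValueError("rate_gb_per_s must be positive")
    seconds = volume_pb * GB_PER_PB / rate_gb_per_s
    return seconds / SECONDS_PER_DAY


def projected_volume_pb(initial_pb: float, annual_growth: float, years: int) -> float:
    """Archive size after `years` of compound growth (0.40 means 40% per year)."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return initial_pb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # BaBar Archive Stage 1: 2 PB at the two ends of the quoted 1-10 GB/s range.
    for rate in (1, 10):
        print(f"2 PB at {rate} GB/s: {transfer_days(2, rate):.1f} days")

    # FIRE: 20 PB today, growing at 40% per year.
    for year in range(0, 6):
        print(f"FIRE after {year} years: {projected_volume_pb(20, 0.40, year):.1f} PB")
```

Under these assumptions, ingesting the 2 PB BaBar data set takes roughly 23 days at 1 GB/s and a little over 2 days at 10 GB/s, while a 20 PB archive growing at 40% per year passes 100 PB within five years. Real transfers would be slower once overheads and verification are included.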

Editor's Notes

  1. So, enough about who I am; let’s move on to the next important question. What is CERN? Do you guys know what CERN is? Do you know what the LHC is? No worries if you don’t know; I am going to explain everything in the next slide.