Prototype Phase Kick-off Event and Ceremony

On Monday 7 December 2020, the consortia selected for the ARCHIVER prototype phase were announced during a Public Award Ceremony.
The Kick-off marks the beginning of the Prototype implementation Phase, in which the three consortia selected to move forward will build prototypes of their solutions, including all components. IT specialists from the buyers’ group will perform basic functionality, interoperability, and security tests.

Published in: Technology

  1. European Molecular Biology Laboratory • European Bioinformatics Institute • The home for big data in biology
  2. What is EMBL-EBI? • Europe’s home for biological data services, research and training • A trusted data provider for the life sciences • Part of the European Molecular Biology Laboratory, an intergovernmental research organisation • International: 650 members of staff from 66 nations
  3. EMBL member states: Austria, Belgium, Croatia, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Luxembourg, Malta, Montenegro, the Netherlands, Norway, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom • Associate member states: Argentina, Australia • Prospect member states: Lithuania, Poland
  4. Our mission • Deliver excellent research • Train the next generation of scientists • Engage with industry • Coordinate bioinformatics in Europe • Deliver scientific services
  5. The European Molecular Biology Laboratory • 6 sites in Europe • 80+ nationalities • >1700 personnel • Heidelberg, Germany: Main Laboratory • Barcelona, Spain: Tissue Biology, Disease Modeling • Hinxton, Cambridge, UK: Bioinformatics • Rome, Italy: Mouse Biology • Grenoble, France: Structural Biology • Hamburg, Germany: Structural Biology
  6. Data resources at EMBL-EBI
  7. Database interactions • Data exchange between EBI data resources • Arc width weighted by the number of different data types exchanged
  8. Data volume doubles every two years • => half of our data will always be < 2 years old • EGA and ENA account for the bulk of the data (DNA sequences) • BioImaging repository: just starting, will be big • [Chart: data volume, 1 GB to 1 PB, 2004 to 2019]
  9. And is getting cheaper to produce • [Chart: cost per human genome compared with Moore’s Law, $100M to $100, 2001 to 2020]
  10. See the live map at www.ebi.ac.uk/about/our-impact
  11. EMBL-EBI
  12. Current Scientific Data Repositories • ARCHIVER “current state of the art” report: https://doi.org/10.5281/zenodo.3618215
  13. EMBL1–FIRE • PIC2–MixFileStorage • DESY1–IndividualScientist • CERN2–CERNOpenData • CERN3–CERNDigitalMemory • CERN1–TheBaBarExperiment • PIC3–DataDistribution • EMBL2–CloudCaching • PIC1–LargeFileStorage • DESY2–PetraIIIExperiment • DESY3–EUXFELExperiment
  14. https://archiver-project.eu/early-adopters-programme
  15. European Open Science Cloud (EOSC) • Slide courtesy of Bob Jones (EOSC Sustainability Working Group, CERN)
  16. Slide courtesy of Bob Jones (EOSC Sustainability Working Group, CERN)
  17. Slide courtesy of Bob Jones (EOSC Sustainability Working Group, CERN)
  18. • Analysis results considered in the competitive R&D tender • Technical and organisational measures aligned with European legislation in the services being developed (by default & by design) • Additional use cases expanding further the set of supported scientific domains • Publicly funded research actors external to the ARCHIVER consortium • For consortium members and Early Adopter organisations • Beyond the lifetime of the project • ARCHIVER is the only EOSC related H2020 project focusing on Archiving & Long Term Data Preservation services for PetaByte scale datasets across multiple research domains and countries.
  20. DESY – Archiver use case overview | S. Yakubov, M. Gasthuber | 08/06/2020 • Main sources of data to be archived and preserved: >30PB annual, 2-4PB annual ● two sites ○ Hamburg ○ Zeuthen (near Berlin) ● science areas ○ particle physics (LHC, Belle 2, …) ○ photon science (EuXFEL, Petra III, FLASH) ○ accelerator research (wakefield, Petra IV, …) ○ astrophysics ● all areas “data intensive science”
  21. Automation scale: #objects, volume, bandwidth
      ● individual scientist / small working groups • scientist is the archivist • publication material + condensed data + reference to full datasets • DOI handling • mainly interactive access • few TB, 100MB/sec, 10K objects • ~0.2-0.5PB annual
      ● mid-size working groups (Petra III experiment) • nominated member of the group is the archivist (on behalf of) • raw + derived data + code • DOI + open-data handling • comply with site data policy • few 10TB, 1-2GB/sec, >150K objects • <50% interactive access • ~2-4PB annual
      ● large collaboration / site management (EuXFEL organization) • site nominated archivist responsible for all experiments • raw + calibration data + code • DOI + open-data handling • comply with site data policy • few 100TB, 2-10GB/sec, >30K obj. • very low interactive access • >30PB annual
      ● scale annotations • more or less ‘classical preservation model/practices’ • Archiver challenges • API/CLI usage / less interactive
  26. Data volume doubles every two years • => half of our data will always be < 2 years old • [Chart: data volume, 1 GB to 1 PB, 2004 to 2019]
  31. Early Adopters Programme
  32. Early Adopters Programme • WHAT? • WHY? • HOW?
  33. WHAT? • Public sector & not-for-profit organisations interested in the ARCHIVER PCP • Help to shape the R&D • Test the solutions developed • Potential to purchase pilot-scale services
  34. WHY? Becoming an Early Adopter means: • Be consulted during the preparation of future ARCHIVER phases • Access material produced by the project • Propose your own use cases and get the chance to test resulting services • Benefit from training sessions covering the services developed during the ARCHIVER project • Accelerate the procurement process of pilot-scale services & have certain conditions
  35. What are the obligations as an Early Adopter of ARCHIVER? • Sign a declaration of confidentiality and non-conflict of interest, stating that your organisation will not submit a bid in response to the ARCHIVER Request for Tender • Allow the ARCHIVER Buyers Group to list your organisation’s name in its Request for Tenders and subsequent Call-offs • In case of engagement in testing activities, describe the use case(s) to potentially test using the ARCHIVER services and provide structured feedback on the testing results to the ARCHIVER project • Acknowledge the support of the European Commission and ARCHIVER project in any publications that result from the aforementioned testing activities performed with the developed services.
  36. The Early Adopters engaged so far
  37. Use cases (https://archiver-project.eu/early-adopters-use-cases) • Archival and accessibility of omics data • Archiving Genomic and Imaging Data • Multi-Repository Research Data Harvester and Transformer for Swedish Archival Standard • Preserving Australia’s digital research, education and cultural heritage • Defining National Scale Data Archive Services
  38. HOW? • Are you part of a public sector research organisation with needs for standards-based, cost-effective data archiving and preservation services? • Are high ingest rates, data volumes at scale and long-term support important to you? • Express your interest
  39. Do you want to know more about the Early Adopters Programme? https://archiver-project.eu/early-adopters-programme
  46. ARCHIVER • Arkivum and Google solution • Phase 2: Prototype
  47. Arkivum Perpetua: Cloud Hosted Digital Preservation and Archiving
  48. Stages: Submit & Validate • Preserve & Safeguard • Discovery & Access. Producers (Content Sources): Experiments • Labs • Repositories • Local Servers • Service Providers. Workflow steps: Transfer • Checks & Validation • Metadata Extraction • Ingestion & Organisation • Retention Management • File Format Identification • Characterisation • Validation • Normalisation • Packaging (AIP/DIP) • Index & Dedupe • Search & Navigate • View • Secure Export • Publish. Consumers (Content Destinations): Staff • Researcher • Collaborators • Public Media
  49. Arkivum / Google Solution: • Scalable storage and compute • High speed ingest and access • Policy based cost optimization • OAIS workflows and packages • Digital Preservation rules and actions • FAIR datasets and access • Hosted scientific applications • Open standards and specifications • Exit and migration strategies
  50. Google Cloud Platform: PB Scale Storage, Compute and Networking
  51. Prototype: Portability and Exit Strategies • Deployment in GCP, on-premise and hybrid cloud • Portable to other cloud providers • Kubernetes, containers, Anthos, automated deployment • Exit strategies using data escrow, open standards and fast exports • Portland Common Data Model
  52. Pilot: Long Term Digital Preservation Hosted On GCP
  53. Prototype: Factories for LTDP in Large Scale Science
  54. Prototype: Approach • Automation, Scalability and Efficiency: Preservation Factories • Minimal Effort Ingest / Minimal Viable Preservation • Dataset Authenticity, Integrity and Usability: FAIR • Platform for building Trusted Digital Repositories • Fully SaaS on GCP, but also portable to on-premise and hybrid deployments
  55. Thank you https://www.archiver-project.eu/
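
The claim on the "Data volume doubles every two years" slide, that half of the data will always be less than two years old, follows from simple arithmetic. The sketch below is illustrative only and is not part of the presentation. It assumes idealised exponential growth with a constant doubling time and no deletions; the function name `fraction_newer_than` and its parameters are hypothetical.

```python
def fraction_newer_than(age_years: float, doubling_years: float = 2.0) -> float:
    """Fraction of the current archive that was added within the last `age_years`.

    Assumes the total volume follows V(t) = V0 * 2 ** (t / doubling_years)
    and that nothing is ever deleted (both are simplifying assumptions).
    Then V(t - age) / V(t) = 2 ** (-age / doubling_years), and the newer
    share is one minus that ratio.
    """
    if age_years < 0 or doubling_years <= 0:
        raise ValueError("age_years must be >= 0 and doubling_years must be > 0")
    return 1.0 - 2.0 ** (-age_years / doubling_years)


if __name__ == "__main__":
    # Print the share of data younger than 1, 2 and 4 years.
    for age in (1.0, 2.0, 4.0):
        print(f"younger than {age:g} years: {fraction_newer_than(age):.1%}")
```

With a two-year doubling time the share of data younger than two years is 0.5 regardless of the current total, which matches the slide; under the same assumptions, three quarters of the data is younger than four years.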
