The document describes several deployment scenarios for storing and processing scientific data from instruments like the MAGIC Telescopes.
It outlines scenarios for: 1) large file safe-keeping, 2) mixed file safe-keeping including smaller files and reprocessing outputs, 3) in-archive data processing, 4) distributing data to instrument analysts, and 5) external user access.
Challenges include meeting data transfer timelines, ensuring data integrity over long periods, and providing flexible access and metadata tools to support analysis and discovery. Commercial providers could offer solutions, but the talk also discusses trust in such services and the customization they would need to support.
1. PIC Deployment Scenarios
V. Acín, J. Casals, M. Delfino, J. Delgado
ARCHIVER Open Market Consultation event
London Stansted, May 23rd, 2019
2. About PIC (scientific support)
● Port d’Informació Científica (PIC, the Scientific Information Harbour) is
Spain’s largest scientific data centre for Particle Physics and Astrophysics
● CERN’s Large Hadron Collider (LHC):
One of the 12 first-level (Tier-1) data processing centres.
● Imaging Atmospheric Cherenkov Telescopes: Custodial data centre for the
MAGIC Telescopes and the first Large Scale Telescope prototype for the
next-generation Cherenkov Telescope Array (CTA, an ESFRI landmark).
● Observational Cosmology: One of the 9 Science Data Centres for ESA’s
Euclid mission, custodial data centre for huge simulations of the Universe
expansion and the Physics of the Accelerating Universe (PAU) survey.
● Innovative “Big Data” platform for massive analysis of large datasets.
● 20 people, 50% engineers and 50% Ph.D.s (Comp. Sci, Physics, Chemistry)
3. About PIC (technical)
● 8500 x86 cores (mostly bare-metal, scheduled through HTCondor)
● 11 PiB disk (dCache) + 25 PiB tape (Enstore) with active HSM
● Overprovisioned 10 Gbps LAN (moving to 100 Gbps next year)
● 2x10 Gbps WAN, optical paths to CERN and ORM
● Hadoop cluster for data analysis
○ 16 nodes, 2 TiB RAM, 192 TiB HDD
○ Prototyping with NVMe and NVDIMM
● GPUs: proof of concept for training neural nets
● Heavily automated installation: Puppet, Icinga, Grafana, etc.
● Compact, highly energy efficient installation
5. Bottom-up description of scenarios and workflows
● Actors:
○ Instrument (example used will be the MAGIC Telescopes in La Palma, Canary Islands, Spain)
○ Private Data Center (PIC near Barcelona, Spain)
○ Instrument Analysts (closed group of users)
○ External users (scientists not members of MAGIC, public access)
● Scenarios:
○ Large file safe-keeping
○ + Mixed-size file safe-keeping
○ + In-archive data processing
○ + Data distribution to Instrument Analysts
○ + Data utilization by External users
6. Large file safe-keeping workflow
Example:
MAGIC Telescopes
located at
Observatorio del
Roque de los
Muchachos, La
Palma, Canary
Islands
7. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
The original service used the shared 1 Gbps
general ORM-RedIRIS connection; a dedicated
10 Gbps λ was implemented to ensure
compliance with the 8-hour window
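The window arithmetic behind that upgrade is simple to check (a sketch, assuming TB = 10^12 bytes and ignoring protocol overhead):

```python
# Back-of-envelope throughput check for the daily 8-hour transfer window.
# Assumes 1 TB = 1e12 bytes; real-world protocol overhead and retries
# would push the required link capacity higher still.

def required_gbps(volume_tb: float, window_hours: float) -> float:
    """Sustained throughput in Gbit/s needed to move volume_tb in window_hours."""
    bits = volume_tb * 1e12 * 8        # total bits to transfer
    seconds = window_hours * 3600
    return bits / seconds / 1e9

print(f"{required_gbps(2.0, 8.0):.2f} Gbps sustained")  # worst case: 2 TB in 8 h
```

At roughly 0.56 Gbps sustained for the 2 TB worst case, a shared 1 Gbps link leaves essentially no headroom for contention or retransfers, while a dedicated 10 Gbps λ does.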
8. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
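Because a single bit error renders a file useless, end-to-end verification reduces to recomputing the checksum after each hop and comparing it with the value recorded at the telescope. A minimal sketch; SHA-256 via Python's hashlib is an assumption here, since the deck does not specify MAGIC's actual checksum algorithm:

```python
import hashlib

def file_checksum(path: str, chunk_size: int = 8 * 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so 2 GB files never sit fully in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, expected: str) -> bool:
    """True iff the stored copy matches the checksum recorded at the telescope."""
    return file_checksum(path) == expected
```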
9. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
Data stewardship:
Year 1: Data accumulates: 150k 2 GB files = 300 TB
Years 1-6: Data are bit-preserved
Random time(s) in years 2-6:
Full 300 TB recalled to disk and reprocessed
10. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
Data stewardship:
Year 1: Data accumulates: 150k 2 GB files = 300 TB
Years 1-6: Data are bit-preserved
Random time(s) in years 2-6:
Full 300 TB recalled to disk and reprocessed
Challenges:
365 days/year × 8-hour time window to complete data transfer, storage and verification
Non-predictable full recall must be accomplished in 30 days or less. (Mitigation: Advance notification)
Cost expectation: compatible with telescope maintenance costs → < 30 k€ per 300 TB stored 5 yrs
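The non-predictable full recall sets the harder bandwidth floor: reading back 300 TB within 30 days needs roughly 0.9 Gbps sustained around the clock, on top of the daily ingest. A quick check (assuming TB = 10^12 bytes):

```python
# What "full 300 TB recalled in 30 days or less" implies for sustained bandwidth.

def recall_gbps(volume_tb: float, days: float) -> float:
    """Sustained throughput in Gbit/s to read back volume_tb within `days` days."""
    return volume_tb * 1e12 * 8 / (days * 86400) / 1e9

print(f"{recall_gbps(300, 30):.2f} Gbps")  # ≈ 0.93 Gbps, day and night, for a month
```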
13. Some motivations for moving to commercial service
● Cherenkov Telescope Array uses OAIS as the basis for their archive
● Pressure to focus on Layer 3 and 4 services
● Cost evolution and hyper-scaling
● Limited physical space on campus
● Disaster recovery
● Uncertainties on availability of tape equipment for on-premises installation
But there is also a list of motivations NOT to move to a commercial service!
● Main item: Distrust of commercial services
14. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
RedIRIS 10 Gbps λ
Commercial
Provider
RedIRIS+Géant
2-week notice
Full recall complete in 30 days
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user.
Any reasonable AA method compatible with interfacing requirements.
300 TB
kept for
5 years
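The "put/get with full error recovery + status check" requirement can be read as: every transfer is retried until the provider-side status check confirms the stored checksum. A scriptable sketch; `put` and `status` are hypothetical stand-ins for whatever API the provider actually exposes:

```python
import time

def put_with_recovery(put, status, name, data, checksum, retries=3, backoff=60):
    """Upload a file, then poll the archive's status check until it confirms
    the stored checksum matches; retry the whole put on any failure.
    `put` and `status` stand in for the provider's (unspecified) API calls."""
    for attempt in range(1, retries + 1):
        try:
            put(name, data)
            if status(name) == checksum:  # archive-side verification
                return True
        except OSError:
            pass                          # network/provider error: retry
        if attempt < retries:
            time.sleep(backoff)           # simple fixed backoff between attempts
    return False
```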
15. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
Commercial
Provider
2-week notice
Full recall complete in 30 days
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user.
Any reasonable AA method compatible with interfacing requirements.
300 TB
kept for
5 years
RedIRIS+Géant
16. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
RedIRIS 10 Gbps λ
Commercial
Provider
Scrubbing: Every file
re-read and checksummed
once per year without buyer
intervention
RedIRIS+Géant
RedIRIS+Géant
2-week notice
Full recall complete in 30 days
2-week notice
Heartbeat: Random
1% sample recalled monthly
Future: Trust through OAIS/ISO?
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user. Any reasonable AA
method compatible with interfacing requirements.
300 TB
kept for
5 years
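The heartbeat is easy to automate: each month, draw a fresh random 1% sample of the archived filenames and recall those files. A sketch of the sampling step only; the recall itself would go through the same get interface:

```python
import random

def heartbeat_sample(filenames, fraction=0.01, seed=None):
    """Pick a random ~1% sample of archived files for a monthly recall test.
    At least one file is always selected so small archives still get checked."""
    rng = random.Random(seed)
    n = max(1, round(len(filenames) * fraction))
    return rng.sample(filenames, n)
```

For the ~150k files accumulated in year 1, the monthly sample is about 1 500 files, i.e. roughly 3 TB recalled per heartbeat.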
17. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
18. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Additional metadata tag: version
19. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Challenges:
put/get directly by reprocessing workflow @PIC
150k files input gives 450k files output to be stored
Cost compatible with maintenance costs:
< 40 k€ per 300 TB stored 5 yrs (v1) + 7.5 k€ per reprocess
Additional metadata tag: version
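With reprocessing in the picture, the per-file metadata grows from the original two items (filename, checksum) to three, so raw data and successive reprocessing outputs of the same run can coexist. A minimal record sketch; the field names and version convention are illustrative, not MAGIC's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArchiveRecord:
    """Per-file metadata: the original two items plus the version tag
    introduced by reprocessing. Field names are illustrative."""
    filename: str
    checksum: str
    version: int  # 0 = raw telescope data, 1+ = successive reprocessing passes

raw = ArchiveRecord("20190523_run001.raw", "ab12...", 0)   # hypothetical names
v1 = ArchiveRecord("20190523_run001.out", "cd34...", 1)
```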
20. + In-archive processing scenario workflow/scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Additional metadata tag: version
Commercial provider
in-archive processing
21. + In-archive processing scenario workflow/scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Challenges:
put/get directly by reprocessing workflow @PIC
150k files input gives 450k files output to be stored
Cost compatible with maintenance costs:
< 40 k€ per 300 TB stored 5 yrs (v1) + 7.5 k€ per reprocess
+ competitive price for CPU with appropriate I/O
Additional metadata tag: version
Commercial provider
in-archive processing
22. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata produced in origin
Extensible
metadata
generated
by experts
23. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata query
Download subset of files
XXX AAI
(AD@Azure)
MAGIC AAI
(ldap@PIC)
Optional additional methods
mount+file system emulation
Selective sync-and-share
Metadata produced in origin
Worldwide
users
Extensible
metadata
generated
by experts
24. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata query
Download subset of files
XXX AAI
(AD@Azure)
MAGIC AAI
(ldap@PIC)
Optional additional methods
mount+file system emulation
Selective sync-and-share
Metadata produced in origin
Worldwide
users
Extensible
metadata
generated
by experts
Challenges:
Interface to multiple, existing, external AA systems + create ACL-type environment
Data must be “online”; the “raw” data component could be excluded
Provide extensible metadata handling system and drive file access by metadata queries
Cost compatible with maintenance costs: < 60 k€ for 5 years of v1 service
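"Drive file access by metadata queries" implies a queryable catalogue in front of the object store: analysts select by metadata, then download only the matching subset. A toy sketch with an in-memory SQLite catalogue; the schema and example fields (e.g. `night`) are invented for illustration:

```python
import sqlite3

# Toy metadata catalogue: analysts query by metadata, then fetch only the
# matching subset of files from the provider. Schema is illustrative.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (filename TEXT, checksum TEXT, version INTEGER, night TEXT)")
db.executemany("INSERT INTO files VALUES (?,?,?,?)", [
    ("run001.raw", "ab12", 0, "2019-05-23"),
    ("run001.out", "cd34", 1, "2019-05-23"),
    ("run002.out", "ef56", 1, "2019-05-24"),
])

def select_files(version, night=None):
    """Return filenames matching a metadata query; the download step
    (get from the provider) would then run over just this subset."""
    q, args = "SELECT filename FROM files WHERE version=?", [version]
    if night:
        q += " AND night=?"
        args.append(night)
    return [row[0] for row in db.execute(q, args)]

print(select_files(1, "2019-05-23"))  # only the reprocessed file for that night
```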
26. Extension to External Users
● From other scientific projects
○ Add additional AAI providers and use group management tools (COmanage or Grouper)
○ If too many AAI providers, still use group management tools and
■ Move to eduGAIN or
■ Move to ORCID
● “Open” data
○ Open ≠ uncontrolled
○ Need to know who accessed data
■ Citation control
■ Statistical information to demonstrate value
● Most likely both will need In-archive analysis / viewing tools
Work in progress
27. 1-year MAGIC Telescope as example. Others...
● Cherenkov Telescope Array will have two sites (Northern and Southern
hemispheres) with 10 large telescopes and 100s of smaller ones
● Studies of the expansion of the Universe with optical telescopes, which
produce data in anything from one-week campaigns to 365 days/year operation
● Supercomputer simulation production can look a lot like an instrument that
produces a lot of data during a short time
● High volume applications such as High Luminosity LHC
● etc…
28. Bottom-up description of scenarios and workflows
● Actors:
○ Instrument (example used will be the MAGIC Telescopes in La Palma, Canary Islands, Spain)
○ Private Data Center (PIC near Barcelona, Spain)
○ Instrument Analysts (closed group of users)
○ External users (scientists not members of MAGIC, public access)
● Scenarios:
○ Large file safe-keeping
○ + Mixed-size file safe-keeping
○ + In-archive data processing
○ + Data distribution to Instrument Analysts
○ + Data utilization by External users