The document describes several deployment scenarios for storing and processing scientific data from instruments like the MAGIC Telescopes.
It outlines scenarios for: 1) large file safe-keeping, 2) mixed file safe-keeping including smaller files and reprocessing outputs, 3) in-archive data processing, 4) distributing data to instrument analysts, and 5) external user access.
Challenges include meeting data transfer timelines, ensuring data integrity over long periods, and providing flexible access and metadata tools to support analysis and discovery. Commercial providers could offer solutions, but the talk also discusses trust in such services and the customization they would need to support.
1. PIC Deployment Scenarios
V. Acín, J. Casals, M. Delfino, J. Delgado
ARCHIVER Open Market Consultation event
London Stansted, May 23rd, 2019
2. About PIC (scientific support)
● Port d’Informació Científica (PIC, the Scientific Information Harbour) is
Spain’s largest scientific data centre for Particle Physics and Astrophysics
● CERN’s Large Hadron Collider (LHC):
One of the 12 first-level (Tier-1) data processing centres.
● Imaging Atmospheric Cherenkov Telescopes: Custodial data centre for the
MAGIC Telescopes and the first Large Scale Telescope prototype for the
next-generation Cherenkov Telescope Array (CTA, an ESFRI landmark).
● Observational Cosmology: One of the 9 Science Data Centres for ESA’s
Euclid mission, custodial data centre for huge simulations of the Universe
expansion and the Physics of the Accelerating Universe (PAU) survey.
● Innovative “Big Data” platform for massive analysis of large datasets.
● 20 people, 50% engineers and 50% Ph.D.s (Comp. Sci, Physics, Chemistry)
3. About PIC (technical)
● 8500 x86 cores (mostly bare-metal, scheduled through HTCondor)
● 11 PiB disk (dCache) + 25 PiB tape (Enstore) with active HSM
● Overprovisioned 10 Gbps LAN (moving to 100 Gbps next year)
● 2x10 Gbps WAN, optical paths to CERN and ORM
● Hadoop cluster for data analysis
○ 16 nodes, 2 TiB RAM, 192 TiB HDD
○ Prototyping with NVMe and NVDIMM
● GPUs: proof of concept for training neural nets
● Heavily automated installation: Puppet, Icinga, Grafana, etc.
● Compact, highly energy efficient installation
5. Bottom-up description of scenarios and workflows
● Actors:
○ Instrument (example used will be the MAGIC Telescopes in La Palma, Canary Islands, Spain)
○ Private Data Center (PIC near Barcelona, Spain)
○ Instrument Analysts (closed group of users)
○ External users (scientists not members of MAGIC, public access)
● Scenarios:
○ Large file safe-keeping
○ + Mixed-size file safe-keeping
○ + In-archive data processing
○ + Data distribution to Instrument Analysts
○ + Data utilization by External users
6. Large file safe-keeping workflow
Example:
MAGIC Telescopes
located at
Observatorio del
Roque de los
Muchachos, La
Palma, Canary
Islands
7. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
The original service used the shared 1 Gbps
general ORM-RedIRIS connection; a dedicated
10 Gbps λ was implemented to ensure
compliance with the 8-hour window
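The window arithmetic behind that upgrade is simple to check (a sketch, assuming TB = 10^12 bytes and ignoring protocol overhead):

```python
# Back-of-envelope throughput check for the daily 8-hour transfer window.
# Assumes 1 TB = 1e12 bytes; real-world protocol overhead and retries
# would push the required link capacity higher still.

def required_gbps(volume_tb: float, window_hours: float) -> float:
    """Sustained throughput in Gbit/s needed to move volume_tb in window_hours."""
    bits = volume_tb * 1e12 * 8        # total bits to transfer
    seconds = window_hours * 3600
    return bits / seconds / 1e9

print(f"{required_gbps(2.0, 8.0):.2f} Gbps sustained")  # worst case: 2 TB in 8 h
```

At roughly 0.56 Gbps sustained for the 2 TB worst case, a shared 1 Gbps link leaves essentially no headroom for contention or retransfers, while a dedicated 10 Gbps λ does.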
8. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
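Because a single bit error renders a file useless, end-to-end verification reduces to recomputing the checksum after each hop and comparing it with the value recorded at the telescope. A minimal sketch; SHA-256 via Python's hashlib is an assumption here, since the deck does not specify MAGIC's actual checksum algorithm:

```python
import hashlib

def file_checksum(path: str, chunk_size: int = 8 * 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so 2 GB files never sit fully in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, expected: str) -> bool:
    """True iff the stored copy matches the checksum recorded at the telescope."""
    return file_checksum(path) == expected
```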
9. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
Data stewardship:
Year 1: Data accumulates: 150k 2 GB files = 300 TB
Years 1-6: Data are bit-preserved
Random time(s) in years 2-6:
Full 300 TB recalled to disk and reprocessed
10. Large file safe-keeping workflow
10:00 Daily data available
18:00 Daily data safe off-telescope
500-1000 files @ 2 GB/file = 1-2 TB
RedIRIS 10 Gbps λ
Data characteristics:
Immutable (read-only)
Binary private format
Single bit error in a file renders it useless
Two metadata items: filename, checksum
Data stewardship:
Year 1: Data accumulates: 150k 2 GB files = 300 TB
Years 1-6: Data are bit-preserved
Random time(s) in years 2-6:
Full 300 TB recalled to disk and reprocessed
Challenges:
365 days/year × 8-hour time window to complete data transfer, storage and verification
Non-predictable full recall must be accomplished in 30 days or less. (Mitigation: Advance notification)
Cost expectation: compatible with telescope maintenance costs → < 30 k€ per 300 TB stored 5 yrs
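The non-predictable full recall sets the harder bandwidth floor: reading back 300 TB within 30 days needs roughly 0.9 Gbps sustained around the clock, on top of the daily ingest. A quick check (assuming TB = 10^12 bytes):

```python
# What "full 300 TB recalled in 30 days or less" implies for sustained bandwidth.

def recall_gbps(volume_tb: float, days: float) -> float:
    """Sustained throughput in Gbit/s to read back volume_tb within `days` days."""
    return volume_tb * 1e12 * 8 / (days * 86400) / 1e9

print(f"{recall_gbps(300, 30):.2f} Gbps")  # ≈ 0.93 Gbps, day and night, for a month
```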
13. Some motivations for moving to commercial service
● Cherenkov Telescope Array uses OAIS as the basis for their archive
● Pressure to focus on Layer 3 and 4 services
● Cost evolution and hyper-scaling
● Limited physical space on campus
● Disaster recovery
● Uncertainties on availability of tape equipment for on-premises installation
But there is also a list of motivations NOT to move to a commercial service!
● Main item: Distrust of commercial services
14. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
RedIRIS 10 Gbps λ
Commercial
Provider
RedIRIS+Géant
2-week notice
Full recall complete in 30 days
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user.
Any reasonable AA method compatible with interfacing requirements.
300 TB
kept for
5 years
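The "put/get with full error recovery + status check" requirement can be read as: every transfer is retried until the provider-side status check confirms the stored checksum. A scriptable sketch; `put` and `status` are hypothetical stand-ins for whatever API the provider actually exposes:

```python
import time

def put_with_recovery(put, status, name, data, checksum, retries=3, backoff=60):
    """Upload a file, then poll the archive's status check until it confirms
    the stored checksum matches; retry the whole put on any failure.
    `put` and `status` stand in for the provider's (unspecified) API calls."""
    for attempt in range(1, retries + 1):
        try:
            put(name, data)
            if status(name) == checksum:  # archive-side verification
                return True
        except OSError:
            pass                          # network/provider error: retry
        if attempt < retries:
            time.sleep(backoff)           # simple fixed backoff between attempts
    return False
```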
15. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
Commercial
Provider
2-week notice
Full recall complete in 30 days
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user.
Any reasonable AA method compatible with interfacing requirements.
300 TB
kept for
5 years
RedIRIS+Géant
16. Commercial safe-keeping deployment scenario
Daily: 500-1000 2 GB files
RedIRIS 10 Gbps λ
Commercial
Provider
Scrubbing: Every file
re-read and checksummed
once per year without buyer
intervention
RedIRIS+Géant
RedIRIS+Géant
2-week notice
Full recall complete in 30 days
2-week notice
Heartbeat: Random
1% sample recalled monthly
Future: Trust through OAIS/ISO?
Interface:
put/get with full error recovery + status check
CLI + scriptable + programmable
Secure with one expert user. Any reasonable AA
method compatible with interfacing requirements.
300 TB
kept for
5 years
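The heartbeat is easy to automate: each month, draw a fresh random 1% sample of the archived filenames and recall those files. A sketch of the sampling step only; the recall itself would go through the same get interface:

```python
import random

def heartbeat_sample(filenames, fraction=0.01, seed=None):
    """Pick a random ~1% sample of archived files for a monthly recall test.
    At least one file is always selected so small archives still get checked."""
    rng = random.Random(seed)
    n = max(1, round(len(filenames) * fraction))
    return rng.sample(filenames, n)
```

For the ~150k files accumulated in year 1, the monthly sample is about 1 500 files, i.e. roughly 3 TB recalled per heartbeat.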
17. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
18. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Additional metadata tag: version
19. Mixed file safe-keeping workflow and scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Challenges:
put/get directly by reprocessing workflow @PIC
150k files input gives 450k files output to be stored
Cost compatible with maintenance costs:
< 40 k€ per 300 TB stored 5 yrs (v1) + 7.5 k€ per reprocess
Additional metadata tag: version
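With reprocessing in the picture, the per-file metadata grows from the original two items (filename, checksum) to three, so raw data and successive reprocessing outputs of the same run can coexist. A minimal record sketch; the field names and version convention are illustrative, not MAGIC's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArchiveRecord:
    """Per-file metadata: the original two items plus the version tag
    introduced by reprocessing. Field names are illustrative."""
    filename: str
    checksum: str
    version: int  # 0 = raw telescope data, 1+ = successive reprocessing passes

raw = ArchiveRecord("20190523_run001.raw", "ab12...", 0)   # hypothetical names
v1 = ArchiveRecord("20190523_run001.out", "cd34...", 1)
```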
20. + In-archive processing scenario workflow/scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Additional metadata tag: version
Commercial provider
in-archive processing
21. + In-archive processing scenario workflow/scenario
Commercial Provider
Scrubbing
RedIRIS+Géant
Full recall complete in 30 days
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
Daily: 500-1000 2 GB files
+ 1500-3000 <200 MB files
2-week notice
50 TB reprocessed output in 30 days
Challenges:
put/get directly by reprocessing workflow @PIC
150k files input gives 450k files output to be stored
Cost compatible with maintenance costs:
< 40 k€ per 300 TB stored 5 yrs (v1) + 7.5 k€ per reprocess
+ competitive price for CPU with appropriate I/O
Additional metadata tag: version
Commercial provider
in-archive processing
22. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata produced in origin
Extensible
metadata
generated
by experts
23. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata query
Download subset of files
XXX AAI
(AD@Azure)
MAGIC AAI
(ldap@PIC)
Optional additional methods
mount+file system emulation
Selective sync-and-share
Metadata produced in origin
Worldwide
users
Extensible
metadata
generated
by experts
24. Data distribution to Instrument Analysts
Commercial Provider
Scrubbing
Heartbeat
300 TB kept for 5 years
+ < 100 TB a posteriori
+ metadata handling
Metadata query
Download subset of files
XXX AAI
(AD@Azure)
MAGIC AAI
(ldap@PIC)
Optional additional methods
mount+file system emulation
Selective sync-and-share
Metadata produced in origin
Worldwide
users
Extensible
metadata
generated
by experts
Challenges:
Interface to multiple, existing, external AA systems + create ACL-type environment
Data must be “online”; the “raw” data component could be excluded
Provide extensible metadata handling system and drive file access by metadata queries
Cost compatible with maintenance costs: < 60 k€ for 5 years of v1 service
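"Drive file access by metadata queries" implies a queryable catalogue in front of the object store: analysts select by metadata, then download only the matching subset. A toy sketch with an in-memory SQLite catalogue; the schema and example fields (e.g. `night`) are invented for illustration:

```python
import sqlite3

# Toy metadata catalogue: analysts query by metadata, then fetch only the
# matching subset of files from the provider. Schema is illustrative.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (filename TEXT, checksum TEXT, version INTEGER, night TEXT)")
db.executemany("INSERT INTO files VALUES (?,?,?,?)", [
    ("run001.raw", "ab12", 0, "2019-05-23"),
    ("run001.out", "cd34", 1, "2019-05-23"),
    ("run002.out", "ef56", 1, "2019-05-24"),
])

def select_files(version, night=None):
    """Return filenames matching a metadata query; the download step
    (get from the provider) would then run over just this subset."""
    q, args = "SELECT filename FROM files WHERE version=?", [version]
    if night:
        q += " AND night=?"
        args.append(night)
    return [row[0] for row in db.execute(q, args)]

print(select_files(1, "2019-05-23"))  # only the reprocessed file for that night
```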
26. Extension to External Users
● From other scientific projects
○ Add additional AAI providers and use group management tools (COmanage or Grouper)
○ If too many AAI providers, still use group management tools and
■ Move to eduGAIN or
■ Move to ORCID
● “Open” data
○ Open ≠ uncontrolled
○ Need to know who accessed data
■ Citation control
■ Statistical information to demonstrate value
● Most likely both will need In-archive analysis / viewing tools
Work in progress
27. 1-year MAGIC Telescope as example. Others...
● Cherenkov Telescope Array will have two sites (Northern and Southern
hemispheres) with 10 large telescopes and 100s of smaller ones
● Studies of the expansion of the Universe with optical telescopes, which
produce data in anything from one-week campaigns to 365 days/year operation
● Supercomputer simulation production can look a lot like an instrument that
produces a lot of data during a short time
● High volume applications such as High Luminosity LHC
● etc…
28. Bottom-up description of scenarios and workflows
● Actors:
○ Instrument (example used will be the MAGIC Telescopes in La Palma, Canary Islands, Spain)
○ Private Data Center (PIC near Barcelona, Spain)
○ Instrument Analysts (closed group of users)
○ External users (scientists not members of MAGIC, public access)
● Scenarios:
○ Large file safe-keeping
○ + Mixed-size file safe-keeping
○ + In-archive data processing
○ + Data distribution to Instrument Analysts
○ + Data utilization by External users