Joint presentation by Ricard de la Vega (CSUC), Nadia Tonello (BSC), Javier Cacheiro (CESGA) and Vanessa Acin (PIC), given at the 15th edition of the users' conference of the Red Española de Supercomputación (RES), held on 16 and 17 September 2021.
RES data projects report 2021
1. RES data projects report 2021
Nadia Tonello – BSC
Javier Cacheiro – CESGA
Ricard de la Vega – CSUC
Vanessa Acin - PIC
17/09/2021 JU-RES technical session
Access: bsc.es/res-intranet
2. Outline
• Call for data projects
• Publication of awarded resources
• First technical report
• Future work before next call
• Questions and comments
4. First RES data projects call
• New experience, learning by doing
• Most candidate projects came from existing contacts of the nodes
• Submission form: kept very open, with many questions
• Evaluation procedure: manual (no web tools)
• Publication of awarded resources
• 4 data nodes involved
• +2 backup nodes
6. CESGA
• Transfer service: DTN with Globus Online, GridFTP and Aspera
Project | Duration | Status | Storage awarded 2021 (TB) | Storage used Sept’21 (TB) | Computing awarded | Computing used (Sept’21) | Services
Uncovering cancer evolution through high-throughput DNA sequencing | 60M | Enabled | 150 | 140 | 20k CPU hours/year | |
Dissemination of the Uchuu cosmological simulation | 60M | Enabled | 109 | 109 | 2 VMs | | Hadoop, Spark, JupyterLab
Exploring the genomics of cancer | 120M | Enabled | 900 | 52 | 20k CPU hours/year | |
HERCULES: High vertical Resolution Climate Simulations Dataset | 60M | Enabled | 100 | 50 | | |
7. CSUC
Project | Duration | Status | Storage awarded 2021 (TB) | Storage used 1st Sept’21 (TB) | Computing awarded | Computing used (Sept’21)
Bat monitoring | 36M | Enabling | 90 | - | 2 VMs | -
• Redefining the project workflow to take advantage of the new infrastructure
8. PIC
Project | Duration | Storage awarded | Storage used (Sept’21) | Computing awarded | Computing used (Sept’21)
ATLAS IFAE publication data | 60M | 200 TB disk + 200 TB tape | 90 TB disk | 100 kcpuhrs/yr | 130 kcpuhrs
PAU survey | 36M | 150 TB disk + 150 TB tape | 20 TB disk and 90 TB tape | 50 kcpuhrs/yr | 258 kcpuhrs
MAGIC Data Legacy | 60M | 200 TB disk + 200 TB tape | 7 TB disk | |
CMS data cache | 60M | 200 TB disk | 100 TB disk | |
Microscopy data management | 60M | 354 TB disk + 354 TB tape | 3 TB disk | |
• CPU used for data analysis is above the original estimation. Not a big problem to adapt to needs.
• Data transfers:
  • Most users use pre-existing systems based on GridFTP/http/xrootd - OK
  • Microscopy user required a new data transfer path (ICFO-PIC): non-trivial setup process.
9. BSC
Project | Duration | Status | Storage awarded 2021 (TB) | Storage used 1st Sept’21 (TB) | Computing awarded | Computing used (Sept’21) | Services | Storage backup node
Federated EGA | 48M | Stand-by | 300 | | 2 VMs | | | IAC
ESGF data node | 48M | Enabling | 1000 | | 3 VMs | migrating | |
Gaia | 60M | Upgrading | 800 | | 2 VMs | migrating | |
ioChem-BD | 36M | Upgrading | 22 | | 2 VMs | migrating | |
300 GC DB | 48M | Enabling | 260 | 47.6 | 200k CPU h/y | | |
Fungal pathogens | 58M | Enabling | 50 | | 2 VMs | testing | | SCAYLE
Nuclear compartments | 36M | Enabling | 24 | | | | |
BioExcel-CV19 | 36M | Enabling | 500 | 2.6 | | | |
• DMP management using DSW
• Transfer service in preparation (for 3 PB data traffic)
• Performance Indicators defined with the projects
11. Work in progress
• New web section dedicated to data calls and project management
• Update of the application form with clearer questions, for applicants and evaluators
• New evaluation/report system (transparency)
Actions for all nodes
• Inform big data users about the possibility of applying to RES data calls
• Prepare available resources for next call
• Possibility to have backup storage in another node
• Increase the collaboration between nodes for DM actions