Integrating Globus into LRZ's Data Science Storage Service

This presentation was given at the 2019 GlobusWorld Conference in Chicago, IL by Stephan Peinkofer from Leibniz Supercomputing Centre.

  1. Integrating Globus into LRZ’s Data Science Storage Service. GlobusWorld 2019 | 2019-05-01 | Stephan Peinkofer
  2. Bavarian Academy of Sciences and Humanities, Leibniz Supercomputing Centre. IT Service Backbone for the Advancement of Science and Research. Computer Centre for all Munich Universities. Regional Computer Centre for all Bavarian Universities. National Supercomputing Centre (GCS). European Supercomputing Centre (PRACE). Approx. 250 employees. 57 years of IT support.
  3. LRZ as an IT Center of Excellence: Operating Cutting-Edge IT Infrastructure. High Performance Computing: SuperMUC-NG, LRZ Linux Cluster. Virtual Reality and Visualisation: V2C (CAVE, Powerwall). High Speed Networking: Munich Scientific Network. Big Data: Bavarian State Library Digital Archive. Storage, Network, Cloud Computing, Cluster, HPC, Training, Consultancy, Email.
  4. Data Silos
  5. Increasing User Demand. “I need to share a 400 TB dataset with someone in Canada!” “My experiment will generate multiple PBs that have to be analyzed and backed up!” “I want to build a WebApp that allows users to interactively analyze my 500 TB SuperMUC simulation data!” “I need to share some data on SuperMUC between multiple projects!” “I want to analyze a large dataset, generated on SuperMUC, using some special OS image on the LRZ Cloud!” How?
  6. Satisfying User Demands. So basically we need to provide: a file system that can be shared amongst the complete LRZ HPC ecosystem; some kind of external access mechanism for arbitrary entities; a Dropbox-like data management approach.
  7. LRZ Data Science Storage. Interactive processing on the LRZ Compute Cloud. Remote visualisation on LRZ’s visualisation systems. External access and sharing via Globus Online. High performance backup and archive of data on LRZ’s Backup and Archive System. Batch and interactive processing on a dedicated, hosted HPC cluster at LRZ. High throughput batch processing on LRZ’s Linux Cluster or SuperMUC.
  8. Architecture diagram, component labels: IBM Spectrum Scale; IBM Spectrum Protect; LRZ Identity Management System; Globus Mission Control; DSSWeb Self Service Portal; REST API; CES; Globus Sharing; Globus Connect Server; RabbitMQ Message Bus; REST API; Client Management Service REST API; Operations Center REST API.
  9. The Big Picture
  10. DSS Containers. Huber (LMU), user lmuuser2: LinuxCluster project lxpr2 with user lx22bp; SuperMUC project smpr2 with user sm33sx. Maier (TUM), user tumuser1: LinuxCluster project lxpr1 with user lx11xc; SuperMUC project smpr1 with user sm11bb. DSS POSIX group in IDM/LDAP: pr45xa-dss-0000. DSS Container → GPFS Independent Fileset: /dss/dssfs01/pr45xa-dss-0000 (drwxrws--- root pr45xa-dss-0000).
  11. Technical Integration of Globus to LRZ DSS. Goal: Integrate Globus Sharing into the DSSWeb Self-Service Portal. Allow Data Curators to share DSS Containers with arbitrary external users. Problem and action: Globus lets us control who can share and what can be shared. We need to control who can share what.
  12. Technical Integration of Globus to LRZ DSS. Components: Data Curator; DSSWeb; RobotUser (aka RobotUser@globusid.org); Globus Online; LRZ MyProxy; DSS Globus Endpoint; Shared Endpoint “LRZ DSS Container X”; LRZ Data Science Storage with DSS Container X, its Container Group, and the DSS Container Directory /dss/dssfs01/dsscontX. Steps: (1) Enable Globus Sharing for DSS Container X. (2) Login to MyProxy to get Certificate. (3) Enable DSS Globus Endpoint. (4) Create Shared Endpoint “LRZ DSS Container X”. (5) Globus Magic. (6) Add RobotUser to Container Access Group.
  13. Technical Integration of Globus to LRZ DSS. Components: Data Curator; DSSWeb; RobotUser (aka RobotUser@globusid.org); Globus Online; DSS Globus Endpoint*; Shared Endpoint “LRZ DSS Container X”; LRZ Data Science Storage with DSS Container X, its Container Group, and the DSS Container Directory /dss/dssfs01/dsscontX; bop@wherever.com. Steps: (1) Invite bop@wherever.com to access DSS Container X via Globus. (2) Check if identity bop@wherever.com is already known by Globus and, if not, create it. (3) Add Globus ACL for Shared Endpoint “LRZ DSS Container X” for identity bop@wherever.com. (4) Globus Magic. (5) Bop is happy.
  14. Legal Integration of Globus to LRZ DSS. Regulation: The European Union has enforced the EU General Data Protection Regulation (GDPR) since 2018-05-25. Use or integration of cloud services that process PII requires a formal Controller-Processor Agreement. Transfer of personal data to third countries requires special safeguards. HIPAA, NIST, and BAA to the rescue: HIPAA and NIST require technical and organizational security controls roughly similar to those GDPR requires to protect PII. Globus agreed to sign a Controller-Processor Agreement that contains the EU Model Clauses.
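Note on slide 10 (DSS Containers): the listing `drwxrws--- root pr45xa-dss-0000` can be decoded with Python's standard `stat` module. The sketch below only reconstructs the mode bits from the listing shown on the slide; the helper name `describe` is made up for illustration, and the slide does not say why the setgid bit is set. On Linux, setgid on a directory makes new entries inherit the directory's group, which would fit a container whose access is tied to one POSIX group.

```python
import stat

# Mode bits implied by the listing "drwxrws---" on the slide:
# a directory, rwx for owner and group, setgid set, nothing for others.
CONTAINER_MODE = stat.S_IFDIR | stat.S_ISGID | 0o770


def describe(mode: int) -> dict:
    """Summarize the access-relevant bits of a file mode (illustrative helper)."""
    return {
        "listing": stat.filemode(mode),
        "is_dir": stat.S_ISDIR(mode),
        "setgid": bool(mode & stat.S_ISGID),
        "group_can_write": bool(mode & stat.S_IWGRP),
        "others_have_access": bool(mode & stat.S_IRWXO),
    }


if __name__ == "__main__":
    for key, value in describe(CONTAINER_MODE).items():
        print(f"{key}: {value}")
```

With these bits, only root and members of the container group can enter or modify the directory; everyone else has no access.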

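Note on slide 11 (who can share what): the difference between two independent controls and one combined control can be made concrete with a small sketch. Everything here is an assumption for illustration: the second container name `pr00zz-dss-0000`, the assignment of users to containers, and both function names are hypothetical, and this is not how Globus or DSSWeb implement their checks.

```python
# Two independent controls: a set of users who may share at all,
# and a set of containers that may be shared at all.
SHARERS = {"tumuser1", "lmuuser2"}
SHAREABLE = {"pr45xa-dss-0000", "pr00zz-dss-0000"}  # second name is hypothetical

# One combined control: each container lists its own Data Curators.
# The assignment below is invented for the example.
CURATORS = {
    "pr45xa-dss-0000": {"tumuser1"},
    "pr00zz-dss-0000": {"lmuuser2"},
}


def independent_check(user: str, container: str) -> bool:
    """'Who can share?' and 'What can be shared?' answered separately."""
    return user in SHARERS and container in SHAREABLE


def pairwise_check(user: str, container: str) -> bool:
    """'Who can share what?' answered per (user, container) pair."""
    return user in CURATORS.get(container, set())


if __name__ == "__main__":
    # tumuser1 is not a curator of the second container.
    print(independent_check("tumuser1", "pr00zz-dss-0000"))
    print(pairwise_check("tumuser1", "pr00zz-dss-0000"))
```

The independent check lets any permitted sharer share any shareable container, which is the gap the slide describes; the pairwise check closes it.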
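Note on slides 12 and 13 (enabling sharing and inviting a user): the two step lists can be walked through with an in-memory model. This is a sketch under stated assumptions, not the DSSWeb implementation and not the Globus API: `FakeGlobus`, `DSSWebSketch`, and all method names are hypothetical stand-ins, and the MyProxy login, endpoint activation, and "Globus Magic" steps are left out because the slides give no detail on them.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FakeGlobus:
    """In-memory stand-in for Globus Online (hypothetical, not the real API)."""
    identities: Set[str] = field(default_factory=set)
    shared_endpoints: Dict[str, str] = field(default_factory=dict)  # name -> path
    acls: Dict[str, Set[str]] = field(default_factory=dict)  # name -> identities

    def create_shared_endpoint(self, name: str, path: str) -> None:
        self.shared_endpoints[name] = path
        self.acls.setdefault(name, set())

    def ensure_identity(self, email: str) -> bool:
        """Register the identity if unknown; return True if it was created."""
        created = email not in self.identities
        self.identities.add(email)
        return created

    def add_acl(self, endpoint_name: str, email: str) -> None:
        if endpoint_name not in self.shared_endpoints:
            raise KeyError(f"no shared endpoint named {endpoint_name!r}")
        self.acls[endpoint_name].add(email)


@dataclass
class DSSWebSketch:
    """Hypothetical model of the portal acting through the robot user."""
    globus: FakeGlobus
    robot_user: str = "RobotUser@globusid.org"
    container_groups: Dict[str, Set[str]] = field(default_factory=dict)

    def enable_sharing(self, container: str, path: str) -> str:
        name = f"LRZ DSS {container}"
        self.globus.create_shared_endpoint(name, path)  # slide 12, step 4
        # Slide 12, step 6: the robot user joins the container access group.
        self.container_groups.setdefault(container, set()).add(self.robot_user)
        return name

    def invite(self, container: str, email: str) -> None:
        name = f"LRZ DSS {container}"
        self.globus.ensure_identity(email)  # slide 13, step 2
        self.globus.add_acl(name, email)  # slide 13, step 3


if __name__ == "__main__":
    globus = FakeGlobus()
    web = DSSWebSketch(globus)
    endpoint = web.enable_sharing("Container X", "/dss/dssfs01/dsscontX")
    web.invite("Container X", "bop@wherever.com")
    print(endpoint, sorted(globus.acls[endpoint]))
```

In this model an invitation fails with a `KeyError` unless sharing was enabled for the container first, which mirrors the order of the two slides.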