SlideShare a Scribd company logo
1 of 19
Download to read offline
Rachana Ananthakrishnan
ranantha@uchicago.edu
February 28, 2024
Best Practices for Data Sharing
Secure data sharing …from any storage
Collaborator logs into Globus
and accesses shared files;
no local account required;
download via Globus
2
On-prem or
public cloud
storage
Select files to share,
select user or group,
and set access
permissions
1
Globally accessible
multi-tenant service
Globus controls
access to shared files
on existing storage
Laptop, server,
compute facility
• Fine-grained access
control “overlay” on
storage system
• Share with any identity,
email, group
• No need to stage data just
for sharing
v
Guest collections
• Directly addressable entities
• Bulk data access (via Globus transfer service)
• HTTP/S access directly from collection
• Created by authorized users to share data they
have access to
• Permissions at folder level, for user/group/service
credential to access data
• Roles for granting management rights
Let’s try it…
• Create guest collection
• Set permissions
• Set Roles
Tutorial cheatsheet: bit.ly/gw-tut-rpi
Considerations for using data sharing
• Guest collection creation cannot be automated
• Permission management can be fully automated
• Typical pattern:
– One guest collection
– Many permissions per folder
• Clean up permissions and guest collections when not
in use
Administrator controls
• Enable use of sharing
• What parts of the file system
• Which users
• What level of sharing (read-only)
• Share with users in specific domains
• Monitor and manage permissions on guest
collection
Guest collections can be applied in
variety of scenarios
Data from instruments
• Provide near-real time
access to data
• Automated permissions
based on site policy
• Self managed by the PI
• Federated login to
access data
Raw data store
Personal Computer
Remote
visualization/analysis
Local
policy
store
--/cohort045
--/cohort096
--/cohort127
Distribution from data archive/repository
• Portal/science gateway
to distribute data
• Interface to search and
gather data of interest
• Asynchronous transfer
to user’s system or via
HTTPS to “staged” data
• Fine-grained
authorization enforced
Search and request
data of interest
Transfer
data to
destination
Example: Instrument data delivery at scale
Use Globus to deliver
100s of TB of genomic
data to researchers
Credits: Joe George, University of Michigan
Core center data processing
• Allow user to securely
upload data for analysis
• Make analysis results
available to user
• Automate setup and
tear down of folders
and permissions
--/123/input rw
Analysis System
--/123/output r
Automate guest collection
management
What do you need to automate?
• Service accounts or application credentials
– Client id and Secret
– Identity of the application: client_id@clients.auth.globus.org
• Guest collection
• Permission for service account to manage the guest
collection
Using service
accounts
14
Registering a service account
• Webapp - Settings
– app.globus.org/settings/developers
Accessing data using service accounts
• Service accounts have a client id and secret
• There is no user involved, so there is no consent
• Transfer service sees the request identity as
client_id@clients.auth.globus.org
• The identity must have permissions for operations
– E.g. For transfer: read at source, and write at destination
– E.g. For permission management: must have access manager
role
Let’s try allowing an app manage permission
• Create guest collection or use one you
already created
• Set permission for service account via
webapp
• Try accessing data as guest collection
Tutorial cheatsheet: bit.ly/gw-tut-rpi
Grant a role for the service account
• Set access
manager for
service account
to manage
permissions
Let’s try allowing an app manage permission
• Set Access Manager role for the service
account
• Try listing/setting permissions on the
guest collection
Tutorial cheatsheet: bit.ly/gw-tut-rpi

More Related Content

Similar to Best Practices for Data Sharing Using Globus

Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentry
Brock Noland
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
Globus
 

Similar to Best Practices for Data Sharing Using Globus (20)

Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
Introduction to Globus for New Users (GlobusWorld Tour - Columbia University)
 
Globus High Assurance for Protected Data (GlobusWorld Tour - Columbia Univers...
Globus High Assurance for Protected Data (GlobusWorld Tour - Columbia Univers...Globus High Assurance for Protected Data (GlobusWorld Tour - Columbia Univers...
Globus High Assurance for Protected Data (GlobusWorld Tour - Columbia Univers...
 
Introduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for ResearchersIntroduction to Data Transfer and Sharing for Researchers
Introduction to Data Transfer and Sharing for Researchers
 
Globus presentation
Globus presentationGlobus presentation
Globus presentation
 
Authentication Authorization-Lesson-2-Slides.ppt
Authentication Authorization-Lesson-2-Slides.pptAuthentication Authorization-Lesson-2-Slides.ppt
Authentication Authorization-Lesson-2-Slides.ppt
 
Scalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data PortalScalable Data Management: Automation and the Modern Research Data Portal
Scalable Data Management: Automation and the Modern Research Data Portal
 
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
Introduction to Globus for New Users (GlobusWorld Tour - UCSD)
 
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
Globus: A Data Management Platform for Collaborative Research (CHPC 2019 - So...
 
Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentry
 
Introduction to Globus for New Users
Introduction to Globus for New UsersIntroduction to Globus for New Users
Introduction to Globus for New Users
 
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
Globus High Assurance for Protected Data (GlobusWorld Tour - UCSD)
 
Automating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with GlobusAutomating Research Data Management at Scale with Globus
Automating Research Data Management at Scale with Globus
 
Globus status and publication plans
Globus status and publication plansGlobus status and publication plans
Globus status and publication plans
 
Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)Introduction to Globus (GlobusWorld Tour West)
Introduction to Globus (GlobusWorld Tour West)
 
GlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobusWorld 2021 Tutorial: Introduction to Globus
GlobusWorld 2021 Tutorial: Introduction to Globus
 
Instrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and FlowsInstrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and Flows
 
Introduction to Globus for Researchers
Introduction to Globus for ResearchersIntroduction to Globus for Researchers
Introduction to Globus for Researchers
 
Jupyter + Globus: The Foundation for Interactive Data Science
Jupyter + Globus: The Foundation for Interactive Data ScienceJupyter + Globus: The Foundation for Interactive Data Science
Jupyter + Globus: The Foundation for Interactive Data Science
 
Sept 24 NISO Virtual Conference: Library Data in the Cloud
Sept 24 NISO Virtual Conference: Library Data in the CloudSept 24 NISO Virtual Conference: Library Data in the Cloud
Sept 24 NISO Virtual Conference: Library Data in the Cloud
 
Introduction to Globus
Introduction to GlobusIntroduction to Globus
Introduction to Globus
 

More from Globus

Providing Globus Services to Users Of JASMIN for Environmental Data Analysis
Providing Globus Services to Users Of JASMIN for Environmental Data AnalysisProviding Globus Services to Users Of JASMIN for Environmental Data Analysis
Providing Globus Services to Users Of JASMIN for Environmental Data Analysis
Globus
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Extending Globus into a Site-wide Automated Data Infrastructure
Extending Globus into a Site-wide Automated Data InfrastructureExtending Globus into a Site-wide Automated Data Infrastructure
Extending Globus into a Site-wide Automated Data Infrastructure
Globus
 

More from Globus (20)

Reactive Documents and Computational Pipelines
Reactive Documents and Computational PipelinesReactive Documents and Computational Pipelines
Reactive Documents and Computational Pipelines
 
Providing Globus Services to Users Of JASMIN for Environmental Data Analysis
Providing Globus Services to Users Of JASMIN for Environmental Data AnalysisProviding Globus Services to Users Of JASMIN for Environmental Data Analysis
Providing Globus Services to Users Of JASMIN for Environmental Data Analysis
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
Innovating Inference: Remote Triggering of Large Language Models on HPC Clust...
Innovating Inference: Remote Triggering of Large Language Models on HPC Clust...Innovating Inference: Remote Triggering of Large Language Models on HPC Clust...
Innovating Inference: Remote Triggering of Large Language Models on HPC Clust...
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
GlobusWorld 2024: Opening Keynote Address
GlobusWorld 2024: Opening Keynote AddressGlobusWorld 2024: Opening Keynote Address
GlobusWorld 2024: Opening Keynote Address
 
Globus Connect Server Deep Dive - Advanced Configuration Options and Use Cases
Globus Connect Server Deep Dive - Advanced Configuration Options and Use CasesGlobus Connect Server Deep Dive - Advanced Configuration Options and Use Cases
Globus Connect Server Deep Dive - Advanced Configuration Options and Use Cases
 
Globus Compute with Integrated Research Infrastructure (IRI) Workflows
Globus Compute with Integrated Research Infrastructure (IRI) WorkflowsGlobus Compute with Integrated Research Infrastructure (IRI) Workflows
Globus Compute with Integrated Research Infrastructure (IRI) Workflows
 
Exploring Innovations in Data Repository Solutions Insights from the U.S. Geo...
Exploring Innovations in Data Repository Solutions Insights from the U.S. Geo...Exploring Innovations in Data Repository Solutions Insights from the U.S. Geo...
Exploring Innovations in Data Repository Solutions Insights from the U.S. Geo...
 
Globus at the U.S. Geological Survey (USGS)
Globus at the U.S. Geological Survey (USGS)Globus at the U.S. Geological Survey (USGS)
Globus at the U.S. Geological Survey (USGS)
 
Globus and the Integrated Research Infrastructure (IRI)
Globus and the Integrated Research Infrastructure (IRI)Globus and the Integrated Research Infrastructure (IRI)
Globus and the Integrated Research Infrastructure (IRI)
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Extending Globus into a Site-wide Automated Data Infrastructure
Extending Globus into a Site-wide Automated Data InfrastructureExtending Globus into a Site-wide Automated Data Infrastructure
Extending Globus into a Site-wide Automated Data Infrastructure
 
Enhancing Research Orchestration Capabilities at ORNL.pptx
Enhancing Research Orchestration Capabilities at ORNL.pptxEnhancing Research Orchestration Capabilities at ORNL.pptx
Enhancing Research Orchestration Capabilities at ORNL.pptx
 
Enhancing Performance with Globus and the Science DMZ.pdf
Enhancing Performance with Globus and the Science DMZ.pdfEnhancing Performance with Globus and the Science DMZ.pdf
Enhancing Performance with Globus and the Science DMZ.pdf
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Climate Science Flows Enabling Petabyte-Scale Climate Analysis with the Earth...
Climate Science Flows Enabling Petabyte-Scale Climate Analysis with the Earth...Climate Science Flows Enabling Petabyte-Scale Climate Analysis with the Earth...
Climate Science Flows Enabling Petabyte-Scale Climate Analysis with the Earth...
 
Introduction to Globus Compute - GlobusWorld 2024
Introduction to Globus Compute - GlobusWorld 2024Introduction to Globus Compute - GlobusWorld 2024
Introduction to Globus Compute - GlobusWorld 2024
 
Advanced Globus System Administration Topics
Advanced Globus System Administration TopicsAdvanced Globus System Administration Topics
Advanced Globus System Administration Topics
 
Instrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a FlowInstrument Data Automation: The Life of a Flow
Instrument Data Automation: The Life of a Flow
 

Recently uploaded

Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Lisi Hocke
 
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
drm1699
 

Recently uploaded (20)

Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
Abortion Pill Prices Jozini ](+27832195400*)[ 🏥 Women's Abortion Clinic in Jo...
 
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
Workshop -  Architecting Innovative Graph Applications- GraphSummit MilanWorkshop -  Architecting Innovative Graph Applications- GraphSummit Milan
Workshop - Architecting Innovative Graph Applications- GraphSummit Milan
 
Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?Prompt Engineering - an Art, a Science, or your next Job Title?
Prompt Engineering - an Art, a Science, or your next Job Title?
 
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaUNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
 
Weeding your micro service landscape.pdf
Weeding your micro service landscape.pdfWeeding your micro service landscape.pdf
Weeding your micro service landscape.pdf
 
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Jane Furse ](+27832195400*)[ 🏥 Women's Abortion Clinic i...
 
Effective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeConEffective Strategies for Wix's Scaling challenges - GeeCon
Effective Strategies for Wix's Scaling challenges - GeeCon
 
The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)The mythical technical debt. (Brooke, please, forgive me)
The mythical technical debt. (Brooke, please, forgive me)
 
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
 
Evolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI EraEvolving Data Governance for the Real-time Streaming and AI Era
Evolving Data Governance for the Real-time Streaming and AI Era
 
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
Team Transformation Tactics for Holistic Testing and Quality (NewCrafts Paris...
 
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
 
BusinessGPT - Security and Governance for Generative AI
BusinessGPT  - Security and Governance for Generative AIBusinessGPT  - Security and Governance for Generative AI
BusinessGPT - Security and Governance for Generative AI
 
Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024
 
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
Navigation in flutter – how to add stack, tab, and drawer navigators to your ...
 
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
Abortion Clinic In Pretoria ](+27832195400*)[ 🏥 Safe Abortion Pills in Pretor...
 
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
Anypoint Code Builder - Munich MuleSoft Meetup - 16th May 2024
 
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
 
GraphSummit Milan - Visione e roadmap del prodotto Neo4j
GraphSummit Milan - Visione e roadmap del prodotto Neo4jGraphSummit Milan - Visione e roadmap del prodotto Neo4j
GraphSummit Milan - Visione e roadmap del prodotto Neo4j
 

Best Practices for Data Sharing Using Globus

  • 1. Rachana Ananthakrishnan ranantha@uchicago.edu February 28, 2024 Best Practices for Data Sharing
  • 2. Secure data sharing …from any storage Collaborator logs into Globus and accesses shared files; no local account required; download via Globus 2 On-prem or public cloud storage Select files to share, select user or group, and set access permissions 1 Globally accessible multi-tenant service Globus controls access to shared files on existing storage Laptop, server, compute facility • Fine-grained access control “overlay” on storage system • Share with any identity, email, group • No need to stage data just for sharing v
  • 3. Guest collections • Directly addressable entities • Bulk data access (via Globus transfer service) • HTTP/S access directly from collection • Created by authorized users to share data they have access to • Permissions at folder level, for user/group/service credential to access data • Roles for granting management rights
  • 4. Let’s try it… • Create guest collection • Set permissions • Set Roles Tutorial cheatsheet: bit.ly/gw-tut-rpi
  • 5. Considerations for using data sharing • Guest collection creation cannot be automated • Permission management can be fully automated • Typical pattern: – One guest collection – Many permissions per folder • Clean up permissions and guest collections when not in use
  • 6. Administrator controls • Enable use of sharing • What parts of the file system • Which users • What level of sharing (read-only) • Share with users in specific domains • Monitor and manage permissions on guest collection
  • 7. Guest collections can be applied in variety of scenarios
  • 8. Data from instruments • Provide near-real time access to data • Automated permissions based on site policy • Self managed by the PI • Federated login to access data Raw data store Personal Computer Remote visualization/analysis Local policy store --/cohort045 --/cohort096 --/cohort127
  • 9. Distribution from data archive/repository • Portal/science gateway to distribute data • Interface to search and gather data of interest • Asynchronous transfer to user’s system or via HTTPS to “staged” data • Fine-grained authorization enforced Search and request data of interest Transfer data to destination
  • 10. Example: Instrument data delivery at scale Use Globus to deliver 100s of TB of genomic data to researchers Credits: Joe George, University of Michigan
  • 11. Core center data processing • Allow user to securely upload data for analysis • Make analysis results available to user • Automate setup and tear down of folders and permissions --/123/input rw Analysis System --/123/output r
  • 13. What do you need to automate? • Service accounts or application credentials – Client id and Secret – Identity of the application: client_id@clients.auth.globus.org • Guest collection • Permission for service account to manage the guest collection
  • 15. Registering a service account • Webapp - Settings – app.globus.org/settings/developers
  • 16. Accessing data using service accounts • Service accounts have a client id and secret • There is no user involved, so there is no consent • Transfer service sees the request identity as client_id@clients.auth.globus.org • The identity must have permissions for operations – E.g. For transfer: read at source, and write at destination – E.g. For permission management: must have access manager role
  • 17. Let’s try allowing an app manage permission • Create guest collection or use one you already created • Set permission for service account via webapp • Try accessing data as guest collection Tutorial cheatsheet: bit.ly/gw-tut-rpi
  • 18. Grant a role for the service account • Set access manager for service account to manage permissions
  • 19. Let’s try allowing an app manage permission • Set Access Manager role for the service account • Try listing/setting permissions on the guest collection Tutorial cheatsheet: bit.ly/gw-tut-rpi