Rachana Ananthakrishnan
ranantha@uchicago.edu
February 28, 2024
Best Practices for Data Sharing
Secure data sharing …from any storage
1. Select files to share, select user or group, and set access permissions
2. Collaborator logs into Globus and accesses shared files; no local account required; download via Globus
[Diagram labels: On-prem or public cloud storage; Globally accessible multi-tenant service; Laptop, server, compute facility]
Globus controls access to shared files on existing storage
• Fine-grained access control “overlay” on storage system
• Share with any identity, email, group
• No need to stage data just for sharing
Guest collections
• Directly addressable entities
• Bulk data access (via Globus transfer service)
• HTTP/S access directly from collection
• Created by authorized users to share data they have access to
• Permissions at folder level, for user/group/service credential to access data
• Roles for granting management rights
Let’s try it…
• Create guest collection
• Set permissions
• Set Roles
Tutorial cheatsheet: bit.ly/gw-tut-rpi
Considerations for using data sharing
• Guest collection creation cannot be automated
• Permission management can be fully automated
• Typical pattern:
– One guest collection
– Many permissions per folder
• Clean up permissions and guest collections when not in use
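The "one guest collection, many permissions per folder" pattern could be sketched roughly as below. This is a minimal illustration, not the presenter's code: it assumes the Globus Python SDK (`globus_sdk`) and a `TransferClient` that is already authorized, and the helper names `build_access_rule`, `apply_rules`, and `remove_rules` are hypothetical. The rule fields follow the Transfer API access document as I understand it, where folder paths begin and end with a slash.

```python
def build_access_rule(identity_id, folder, permissions="r"):
    """Build one access rule document for a folder on a guest collection."""
    if permissions not in ("r", "rw"):
        raise ValueError("permissions must be 'r' or 'rw'")
    if not (folder.startswith("/") and folder.endswith("/")):
        raise ValueError("folder must start and end with '/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": "identity",
        "principal": identity_id,  # identity UUID of the user being granted access
        "path": folder,
        "permissions": permissions,
    }


def apply_rules(transfer_client, collection_id, rules):
    """Create each rule and return the access IDs so they can be cleaned up later."""
    return [
        transfer_client.add_endpoint_acl_rule(collection_id, rule)["access_id"]
        for rule in rules
    ]


def remove_rules(transfer_client, collection_id, access_ids):
    """Delete previously created rules (the 'clean up when not in use' step)."""
    for access_id in access_ids:
        transfer_client.delete_endpoint_acl_rule(collection_id, access_id)
```

Keeping the returned access IDs alongside whatever record tracks the share makes the later clean-up step a simple loop rather than a search through the collection's permissions.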
Administrator controls
• Enable use of sharing
• What parts of the file system
• Which users
• What level of sharing (read-only)
• Share with users in specific domains
• Monitor and manage permissions on guest collection
Guest collections can be applied in a variety of scenarios
Data from instruments
• Provide near-real-time access to data
• Automated permissions based on site policy
• Self-managed by the PI
• Federated login to access data
[Diagram labels: Raw data store; Personal Computer; Remote visualization/analysis; Local policy store; folders --/cohort045, --/cohort096, --/cohort127]
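One way to picture "automated permissions based on site policy" is a small reconciliation step that compares the policy store with the rules currently on the guest collection. The sketch below is an assumption-laden illustration: the policy format (folder path mapped to a set of identity IDs) and the function name `plan_acl_sync` are hypothetical, and the existing rules are assumed to look like the access documents returned by the Transfer API permission listing (`id`, `path`, `principal`, `principal_type`).

```python
def plan_acl_sync(policy, existing_rules):
    """Compare desired access (policy) with existing rules.

    policy: dict mapping folder path -> set of identity IDs that should have access.
    existing_rules: iterable of access rule dicts already on the collection.
    Returns (to_add, to_remove): (path, identity) pairs to create, rule IDs to delete.
    """
    desired = {
        (path, identity)
        for path, identities in policy.items()
        for identity in identities
    }
    # Only consider explicit identity rules that have an ID and can be deleted.
    current = {
        (rule["path"], rule["principal"]): rule["id"]
        for rule in existing_rules
        if rule.get("principal_type") == "identity" and rule.get("id") is not None
    }
    to_add = sorted(desired - set(current))
    to_remove = sorted(
        rule_id for key, rule_id in current.items() if key not in desired
    )
    return to_add, to_remove
```

Running such a plan on a schedule, or whenever the policy store changes, would keep cohort folders aligned with policy without anyone editing permissions by hand.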
Distribution from data archive/repository
• Portal/science gateway to distribute data
• Interface to search and gather data of interest
• Asynchronous transfer to user’s system or via HTTPS to “staged” data
• Fine-grained authorization enforced
[Diagram labels: Search and request data of interest; Transfer data to destination]
Example: Instrument data delivery at scale
Use Globus to deliver 100s of TB of genomic data to researchers
Credits: Joe George, University of Michigan
Core center data processing
• Allow user to securely upload data for analysis
• Make analysis results available to user
• Automate setup and tear down of folders and permissions
[Diagram labels: Analysis System; --/123/input (rw); --/123/output (r)]
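The setup and tear-down of per-job folders and permissions might look something like the following. This is a sketch under assumptions: it uses the Globus Python SDK's `TransferClient` methods `operation_mkdir`, `add_endpoint_acl_rule`, and `delete_endpoint_acl_rule`, while the helper names, the job ID layout (`/<job_id>/input/`, `/<job_id>/output/`), and the identity ID are hypothetical. Removing the folders themselves is left out.

```python
def job_access_rules(job_id, user_identity_id):
    """Rules for one job: read-write on input, read-only on output."""
    def rule(folder, permissions):
        return {
            "DATA_TYPE": "access",
            "principal_type": "identity",
            "principal": user_identity_id,
            "path": f"/{job_id}/{folder}/",
            "permissions": permissions,
        }
    return [rule("input", "rw"), rule("output", "r")]


def setup_job(transfer_client, collection_id, job_id, user_identity_id):
    """Create the job folders, grant access, and return the new access IDs."""
    # mkdir is not recursive, so the parent folder is created first.
    for path in (f"/{job_id}/", f"/{job_id}/input/", f"/{job_id}/output/"):
        transfer_client.operation_mkdir(collection_id, path)
    return [
        transfer_client.add_endpoint_acl_rule(collection_id, r)["access_id"]
        for r in job_access_rules(job_id, user_identity_id)
    ]


def teardown_job(transfer_client, collection_id, access_ids):
    """Revoke the job's permissions once results have been collected."""
    for access_id in access_ids:
        transfer_client.delete_endpoint_acl_rule(collection_id, access_id)
```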
Automate guest collection management
What do you need to automate?
• Service accounts or application credentials
– Client ID and secret
– Identity of the application: client_id@clients.auth.globus.org
• Guest collection
• Permission for service account to manage the guest collection
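As a rough illustration of how these pieces fit together, the sketch below derives the application's identity username from its client ID and builds a Transfer client from the client ID and secret. It assumes the Globus Python SDK (`globus_sdk`) is installed; the function names are hypothetical, and the secret would normally come from a secure store rather than source code.

```python
def service_account_username(client_id):
    """Identity username under which Globus services see the application."""
    return f"{client_id}@clients.auth.globus.org"


def make_transfer_client(client_id, client_secret):
    """Return a TransferClient that acts as the service account.

    Uses the client credentials grant, so no user login or consent is involved.
    """
    import globus_sdk  # third-party package, assumed to be installed

    auth_client = globus_sdk.ConfidentialAppAuthClient(client_id, client_secret)
    authorizer = globus_sdk.ClientCredentialsAuthorizer(
        auth_client, globus_sdk.scopes.TransferScopes.all
    )
    return globus_sdk.TransferClient(authorizer=authorizer)
```

The returned client can only do what the service account identity has been permitted to do, so the guest collection still needs a permission or role granted to that identity.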
Using service accounts
Registering a service account
• Webapp - Settings
– app.globus.org/settings/developers
Accessing data using service accounts
• Service accounts have a client ID and secret
• There is no user involved, so there is no consent
• Transfer service sees the request identity as client_id@clients.auth.globus.org
• The identity must have permissions for operations
– E.g., for transfer: read at source, and write at destination
– E.g., for permission management: must have access manager role
Let’s try allowing an app to manage permissions
• Create guest collection or use one you already created
• Set permission for service account via webapp
• Try accessing data on the guest collection
Tutorial cheatsheet: bit.ly/gw-tut-rpi
Grant a role for the service account
• Set access manager for service account to manage permissions
Let’s try allowing an app to manage permissions
• Set Access Manager role for the service account
• Try listing/setting permissions on the guest collection
Tutorial cheatsheet: bit.ly/gw-tut-rpi
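For the listing half of this exercise, a script acting as the service account could do something along these lines. It is only a sketch: it assumes a `TransferClient` from the Globus Python SDK that is authorized as the service account and that the account holds the Access Manager role; `describe_rules` and `list_permissions` are hypothetical helper names.

```python
def describe_rules(rules):
    """Turn access rule documents into short human-readable lines."""
    return [
        f"{rule['path']}  {rule['permissions']}  "
        f"{rule['principal_type']}:{rule['principal']}"
        for rule in rules
    ]


def list_permissions(transfer_client, collection_id):
    """Fetch the guest collection's access rules and describe each one."""
    return describe_rules(transfer_client.endpoint_acl_list(collection_id))
```

If the role has not been granted, the listing call would be expected to fail with an authorization error, which is a quick way to confirm whether the role assignment took effect.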

Best Practices for Data Sharing Using Globus
