GENI Engineering Conference -- Ian Foster


I was invited to talk at the 18th GENI Engineering Conference about the Grid community's experiences with creating and operating large shared infrastructures. I chose to focus on our experiences using Software as a Service (SaaS, aka Cloud) to lower the barriers to using the capabilities required to create and operate virtual organizations.

Published in: Technology, Education

  1. 1. Hosted services for managing shared cyberinfrastructure • Ian Foster, Argonne National Laboratory & The University of Chicago • Joint work with Rachana Ananthakrishnan, Josh Bryan, Kyle Chard, Mattias Lidman, Steven Tuecke, and others • GENI Engineering Conference, NYC, October 28, 2013
  2. 2. Using cloud services to accelerate discovery
  3. 3. Cyberinfrastructure • “a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge” [Wikipedia] • AKA eScience, eResearch, Computer Supported Collaborative Work, Grid, …
  4. 4. “The Anatomy of the Grid,” 2001: The … problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).
  5. 5. Grid technology accelerates discovery • Higgs discovery “only possible because of the extraordinary achievements of … grid computing” (Rolf Heuer, CERN DG) • Large Hadron Collider
  6. 6. LHC Computing Grid “virtual organizations”
  7. 7. Complexity in research is large and growing • Run experiment • Collect data • Move data • Check data • Annotate data • Share data • Find similar data • Link to literature • Analyze data • Publish data
  8. 8. Process automation for discovery • Run experiment • Collect data • Move data • Check data • Annotate data • Share data • Find similar data • Link to literature • Analyze data • Publish data • Discovery IT as a service
  9. 9. First: File transfer as a service • (1) User initiates transfer request • (2) Globus Online moves and syncs files from Data Source to Data Destination • (3) Globus Online notifies user • Easy, Fast, Reliable, Available, Secure
  10. 10. Early adoption is encouraging
  11. 11. Early adoption is encouraging • 12,000 registered users; >150 daily • >25 PB moved; >1B files • 10x (or better) performance vs. scp • 99.9% availability • Entirely hosted on Amazon
  12. 12. Next: Share big data from existing storage • (1) User A selects file(s) to share, selects user or group, and sets permissions • (2) Globus Online tracks shared files; no need to move files to cloud storage! • (3) User B logs in to Globus Online and accesses shared file • Data Source (X, Y) • File X: Users A, B: RW • Directory Y: Group G: R
  13. 13. Globus Online is SaaS for science • Diagram labels: SaaS; Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit
  14. 14. We are now expanding to a platform • Diagram labels: PaaS; Globus Online APIs; SaaS; Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit
  15. 15. Globus Online: Platform-as-a-Service • Diagram labels: Globus Online APIs; Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit
  16. 16. The identity challenge in science • Research communities often need to: – Assign identities to their users – Manage user profiles – Organize users into groups for authorization • Obstacles to high-quality implementations: – Complexity of associated security protocols – Creation of identity silos – Multiple credentials for users – Reliability, availability, scalability, security
  17. 17. Streamline collaborative tool development • Allows developers to focus on core application logic • Simplifies integration with campus infrastructure • Diagram labels: Custom Web Application; Globus Online APIs; Sharing Service; Transfer Service; Globus Connect; Globus Nexus (identity, group, & profile management); Globus Toolkit
  18. 18. Nexus provides four key capabilities • Identity provisioning – Create, manage Globus identities • Identity hub – Link with other identities; use to authenticate to services • Group hub – User-managed groups; groups can be used for authorization • Profile management – User-managed attributes; can use in group admission • Key points: 1) Outsource identity, group, profile management 2) REST API for flexible integration 3) Intuitive, customizable Web interfaces
  19. 19. Identity provisioning • Globus Nexus can act as an identity provider (IDP) for a project – User management, email validation… • DOE Systems Biology Knowledge Base (kBase) is an example of such a project – ~400 identities to date
  20. 20. Identity hub • Link identities from other federated IDP(s) with a Nexus identity – E.g., InCommon/Campus (SAML), Google (OpenID), XSEDE (OAuth MyProxy), IGTF-certified X.509 CA, SSH X.509, via CILogon and MyProxy • Use linked identity to authenticate to Nexus – E.g., use campus identity, XSEDE identity (via OAuth) • Leverage Nexus federated IDP to 3rd-party services – Via OAuth or LDAP – E.g., to Jira, Zendesk, Drupal, Confluence • Have Nexus cache delegated credentials
  21. 21. Identity management
  22. 22. Identity hub: Biomedical science • Dr. Smith creates a Nexus id, via BIRN project interface • Dr. Smith links campus id and XSEDE id • Dr. Smith can then: – Authenticate to BIRN with campus id – Query catalog (Nexus/BIRN id) – Request data transfer from BIRN to campus (Nexus and campus ids) – Request transfer from BIRN to XSEDE (Nexus and XSEDE ids) – Repeat these tasks: use cached credentials • Profile shown: Name: Dr. Smith; Email: ; Linked id: Campus; Linked id: XSEDE • Diagram labels: Campus (SAML); BIRN Gateway; OAuth; Campus identity; XSEDE identity; Nexus identity; XSEDE; BIRN; Campus • (BIRN = Biomedical Informatics Research Network)
  23. 23. Use linked identity
  24. 24. Group hub • User-managed group creation, management • Flexible control over admission policies and visibility • Groups can be used in authorization decisions • Example: kBase – Every kBase user added to kbase_users – Subgroups also created – Groups used for access control
  25. 25. Group membership interface
  26. 26. Branded sites • XSEDE • Open Science Grid • University of Chicago • DOE kBase • Indiana University • University of Exeter • NERSC • NIH BIRN • Globus Online
  27. 27. Implementation and deployment • Diagram labels: Elastic Load Balancer; three Nexus instances, each labeled REST API and Web; OSSEC; Logging; Monitoring
  28. 28. Globus Nexus usage as of 9/13 • >12,000 users and 4977 linked identities • 557 groups totaling: – 1638 active members – 229 pending or invited members – 162 rejected or suspended members • Largest group (kbase) has 402 members • Charts: Total users over time (from Feb-11; axis 0 to 14,000); Users in group (axis 1 to 1000)
  29. 29. Identities and groups in XSEDE • Proposal: Replace current ad-hoc systems with Globus Nexus identity and group service – Reduce complexity, reduce cost, increase capability • Careful process of documentation and review – “Architecture and development requirements: User and identity management” – “User management proposal: Affected use cases” – “User management proposal: Motivating stories” – “Proposal: Refactoring XSEDE identity and group capabilities” • Hope to reach closure by end of 2013
  30. 30. Cloud services to accelerate discovery • Accelerate discovery and innovation worldwide by providing research IT as a service • Leverage software-as-a-service to: – provide millions of researchers with unprecedented access to powerful tools; – enable a massive shortening of cycle times in time-consuming research processes; and – reduce research IT costs dramatically via economies of scale
  31. 31. Thanks to ... U.S. DEPARTMENT OF ENERGY
  32. 32. Thank you! Questions?
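
The sharing model on slide 12 (File X: Users A, B: RW; Directory Y: Group G: R), combined with the linked identities of slide 20 and the group-based authorization of slide 24, can be illustrated with a small sketch. Everything below is a hypothetical in-memory model written for this transcript: `ShareIndex`, `Grant`, the `user:`/`group:` principal strings, the trailing-slash convention for directories, and the `campus:c@example.edu` identity are assumptions for illustration, not part of the Globus Online or Globus Nexus APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    principal: str     # hypothetical form: "user:<name>" or "group:<name>"
    rights: frozenset  # subset of {"R", "W"}


@dataclass
class ShareIndex:
    """Hypothetical index of shares: only permissions are tracked, no file data."""
    groups: dict = field(default_factory=dict)  # group name -> set of user names
    linked: dict = field(default_factory=dict)  # external identity -> user name
    grants: dict = field(default_factory=dict)  # shared path -> list of Grant

    def share(self, path: str, principal: str, rights: str) -> None:
        """Record that `principal` holds `rights` (e.g. "RW") on `path`."""
        self.grants.setdefault(path, []).append(Grant(principal, frozenset(rights)))

    def resolve(self, identity: str) -> str:
        """Map a linked external identity to its user name, if one is linked."""
        return self.linked.get(identity, identity)

    def allowed(self, identity: str, path: str, right: str) -> bool:
        user = self.resolve(identity)
        # A user acts as themselves and as every group they belong to.
        principals = {f"user:{user}"}
        principals.update(
            f"group:{name}" for name, members in self.groups.items() if user in members
        )
        for shared_path, grants in self.grants.items():
            # Assumption: a path ending in "/" is a directory covering its contents.
            is_dir = shared_path.endswith("/")
            if path != shared_path and not (is_dir and path.startswith(shared_path)):
                continue
            if any(g.principal in principals and right in g.rights for g in grants):
                return True
        return False


# The permissions from slide 12, with user C as the only member of group G.
index = ShareIndex(groups={"G": {"C"}}, linked={"campus:c@example.edu": "C"})
index.share("/X", "user:A", "RW")
index.share("/X", "user:B", "RW")
index.share("/Y/", "group:G", "R")

print(index.allowed("B", "/X", "W"))
print(index.allowed("campus:c@example.edu", "/Y/data.csv", "R"))
```

The index stores only who may do what to which path, which mirrors the slide's point that shared files stay on existing storage rather than being copied to cloud storage.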