I was invited to talk at the 18th GENI Engineering Conference (http://groups.geni.net/geni/wiki/GEC18Agenda) on experiences in the Grid community with creating and operating large shared infrastructures. I chose to focus on our experiences using Software as a Service (SaaS: aka Cloud) to reduce barriers to the use of the capabilities required to create and operate virtual organizations.
GENI Engineering Conference -- Ian Foster
1. Hosted services for managing shared cyberinfrastructure
Ian Foster
Argonne National Laboratory & The University of Chicago
Joint work with Rachana Ananthakrishnan, Josh Bryan,
Kyle Chard, Mattias Lidman, Steven Tuecke, and others
GENI Engineering Conference, NYC, October 28, 2013
www.ci.anl.gov
www.ci.uchicago.edu
2. Using cloud services to accelerate discovery
3. Cyberinfrastructure
• “A technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge” [Wikipedia]
• AKA eScience, eResearch, Computer Supported Collaborative Work, Grid, …
4. “The Anatomy of the Grid,” 2001
The … problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).
5. Grid technology accelerates discovery
Higgs discovery “only possible because of the extraordinary achievements of … grid computing” (Rolf Heuer, CERN DG)
Large Hadron Collider
8. Complexity in research is large and growing
Run experiment
Collect data
Move data
Check data
Annotate data
Share data
Find similar data
Link to literature
Analyze data
Publish data
9. Process automation for discovery
Run experiment
Collect data
Move data
Check data
Annotate data
Share data
Find similar data
Link to literature
Analyze data
Publish data
Discovery IT as a service
10. First: File transfer as a service
1. User initiates transfer request
2. Globus Online moves and syncs files (Data Source to Data Destination)
3. Globus Online notifies user
Easy, Fast, Reliable, Available, Secure
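The three-step flow on this slide (request, move and sync, notify) can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the Globus Online API: `Endpoint`, `sync`, and the `notify` callback are hypothetical names, and endpoints are modeled as plain dictionaries rather than remote storage.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Endpoint:
    """Hypothetical stand-in for a data source or destination."""
    name: str
    files: Dict[str, bytes] = field(default_factory=dict)  # path -> contents


def checksum(data: bytes) -> str:
    """Content fingerprint used to decide whether a file needs copying."""
    return hashlib.sha256(data).hexdigest()


def sync(source: Endpoint, destination: Endpoint,
         notify: Callable[[str], None]) -> List[str]:
    """Copy files that are missing or differ at the destination, then notify."""
    copied = []
    for path, data in sorted(source.files.items()):
        existing = destination.files.get(path)
        if existing is None or checksum(existing) != checksum(data):
            destination.files[path] = data
            copied.append(path)
    notify(f"{source.name} -> {destination.name}: {len(copied)} file(s) copied")
    return copied


# Step 1: the user requests a transfer between two (hypothetical) endpoints.
messages: List[str] = []
lab = Endpoint("lab", {"run1.dat": b"abc", "run2.dat": b"def"})
campus = Endpoint("campus", {"run1.dat": b"abc"})
first = sync(lab, campus, messages.append)   # steps 2 and 3
second = sync(lab, campus, messages.append)  # already in sync, nothing copied
```

Comparing checksums rather than copying everything is what makes a repeated request cheap: the second call moves no data but still reports back to the user.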
13. Early adoption is encouraging
12,000 registered users; >150 daily
>25 PB moved; >1B files
10x (or better) performance vs. scp
99.9% availability
Entirely hosted on Amazon
14. Next: Share big data from existing storage
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus Online tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus Online and accesses shared file
Data Source: X, Y
File X: Users A, B: RW
Directory Y: Group G: R
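The two permission rules on this slide can be read as a small access-control table. The sketch below is one possible way to evaluate such rules, not how Globus Online implements sharing: `Rule`, `allowed`, and the membership of group G are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Rule:
    """Permissions ("R", "W") granted on one path to users and groups."""
    users: Dict[str, str] = field(default_factory=dict)
    groups: Dict[str, str] = field(default_factory=dict)


# Rules from the slide: File X: Users A, B: RW; Directory Y: Group G: R.
rules = {
    "X": Rule(users={"A": "RW", "B": "RW"}),
    "Y/": Rule(groups={"G": "R"}),
}
# Hypothetical group membership; the slide does not say who belongs to G.
memberships: Dict[str, Set[str]] = {"G": {"B", "C"}}


def allowed(user: str, path: str, mode: str) -> bool:
    """Return True if `user` holds `mode` ("R" or "W") on `path`."""
    for prefix, rule in rules.items():
        is_dir = prefix.endswith("/")
        # A file rule matches exactly; a directory rule matches by prefix.
        if path != prefix and not (is_dir and path.startswith(prefix)):
            continue
        if mode in rule.users.get(user, ""):
            return True
        for group, perms in rule.groups.items():
            if mode in perms and user in memberships.get(group, set()):
                return True
    return False
```

With these assumptions, `allowed("A", "X", "W")` is true, while `allowed("C", "Y/data.csv", "W")` is false because group G holds read access only. The files stay where they are; only the rule table is held by the service.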
15. Globus Online is SaaS for science
[Diagram, labeled SaaS: Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit]
16. We are now expanding to a platform
[Diagram, labeled SaaS and PaaS: Globus Online APIs; Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit]
17. Globus Online: Platform-as-a-Service
[Diagram: Globus Online APIs; Sharing Service; Transfer Service; Globus Nexus (Identity, Group, Profile); Globus Connect; Globus Toolkit]
18. The identity challenge in science
• Research communities often need to
– Assign identities to their users
– Manage user profiles
– Organize users into groups for authorization
• Obstacles to high-quality implementations
– Complexity of associated security protocols
– Creation of identity silos
– Multiple credentials for users
– Reliability, availability, scalability, security
19. Streamline collaborative tool development
• Allows developers to focus on core application logic
• Simplifies integration with campus infrastructure
[Diagram: Custom Web Application; Globus Online APIs; Sharing Service; Transfer Service; Globus Connect; Globus Nexus (identity, group, & profile management); Globus Toolkit]
20. Nexus provides four key capabilities
• Identity provisioning
– Create, manage Globus identities
• Identity hub
– Link with other identities; use to authenticate to services
• Group hub
– User-managed groups; groups can be used for authorization
• Profile management
– User-managed attributes; can use in group admission
Key points:
1) Outsource identity, group, profile management
2) REST API for flexible integration
3) Intuitive, customizable Web interfaces
21. Identity provisioning
• Globus Nexus can act as an identity provider (IDP) for a project
– User management, email validation…
• DOE Systems Biology Knowledge Base (kBase) is an example of such a project. ~400 identities to date
22. Identity hub
• Link identities from other federated IDP(s) with a Nexus identity
– E.g., InCommon/Campus (SAML), Google (OpenID), XSEDE (OAuth MyProxy), IGTF-certified X.509 CA, SSH
• Use linked identity to authenticate to Nexus
– E.g., use campus identity, XSEDE identity (via OAuth)
• Leverage Nexus federated IDP to 3rd-party services
– Via OAuth or LDAP
– E.g., to Jira, Zendesk, Drupal, Confluence
• Have Nexus cache delegated credentials
– X.509, via CILogon and MyProxy
24. Identity hub: Biomedical science
• Dr. Smith creates a Nexus id, via BIRN project interface
• Dr. Smith links campus id and XSEDE id
• Dr. Smith can then:
– Authenticate to BIRN with campus id
– Query catalog (Nexus/BIRN id)
– Request data transfer from BIRN to campus (Nexus and campus ids)
– Request transfer from BIRN to XSEDE (Nexus and XSEDE ids)
– Repeat these tasks: use cached credentials
Profile shown: Name: Dr. Smith; Email: smith@u.edu; Linked id: Campus; Linked id: XSEDE
[Diagram: Campus (SAML) and OAuth logins reach the BIRN Gateway; the Nexus identity links the Campus and XSEDE identities across XSEDE, BIRN, and Campus]
(BIRN=Biomedical Informatics Research Network)
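The Dr. Smith scenario comes down to one hub identity with several external identities linked to it, so that a login through any linked provider resolves to the same person. The sketch below illustrates that idea only; `IdentityHub`, its methods, and the XSEDE username are hypothetical, and verification of the external login (SAML, OAuth, and so on) is assumed to have happened already.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class NexusIdentity:
    username: str
    email: str
    linked: Dict[str, str] = field(default_factory=dict)  # provider -> external id


class IdentityHub:
    """Hypothetical in-memory hub mapping external identities to one identity."""

    def __init__(self) -> None:
        self._by_name: Dict[str, NexusIdentity] = {}
        self._by_link: Dict[Tuple[str, str], str] = {}

    def create(self, username: str, email: str) -> NexusIdentity:
        identity = NexusIdentity(username, email)
        self._by_name[username] = identity
        return identity

    def link(self, username: str, provider: str, external_id: str) -> None:
        """Attach an external identity to an existing hub identity."""
        self._by_name[username].linked[provider] = external_id
        self._by_link[(provider, external_id)] = username

    def resolve(self, provider: str, external_id: str) -> Optional[NexusIdentity]:
        """Find the hub identity behind an already-verified external login."""
        username = self._by_link.get((provider, external_id))
        return self._by_name.get(username) if username else None


hub = IdentityHub()
hub.create("smith", "smith@u.edu")
hub.link("smith", "campus", "smith@u.edu")
hub.link("smith", "xsede", "drsmith")  # hypothetical XSEDE username
```

Whether Dr. Smith arrives with the campus id or the XSEDE id, `hub.resolve(...)` returns the same record, which is what lets a gateway such as BIRN act on behalf of one user across both systems.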
26. Group hub
• User-managed group creation, management
• Flexible control over admission policies and visibility
• Groups can be used in authorization decisions
Example: kBase
• Every kBase user added to kbase_users
• Subgroups also created
• Groups used for access control
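The group hub ideas on this slide (admission policies, subgroups, and membership-based authorization) can be sketched as follows. This is an illustration under assumptions, not the Nexus data model: the `Group` class, the `"open"` and `"approval"` policy names, the `kbase_curators` subgroup, and the rule that subgroup members must already be active in the parent group are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Group:
    name: str
    policy: str = "approval"          # hypothetical: "open" or "approval"
    parent: Optional["Group"] = None
    status: Dict[str, str] = field(default_factory=dict)  # user -> state

    def request_join(self, user: str) -> str:
        # Assumed rule: subgroup members must be active in the parent group.
        if self.parent is not None and not self.parent.is_active(user):
            self.status[user] = "rejected"
        elif self.policy == "open":
            self.status[user] = "active"
        else:
            self.status[user] = "pending"
        return self.status[user]

    def approve(self, user: str) -> None:
        """A group administrator accepts a pending request."""
        if self.status.get(user) == "pending":
            self.status[user] = "active"

    def is_active(self, user: str) -> bool:
        return self.status.get(user) == "active"


def authorize(user: str, required: Group) -> bool:
    """Authorization decision based only on active group membership."""
    return required.is_active(user)


kbase_users = Group("kbase_users", policy="open")
curators = Group("kbase_curators", policy="approval", parent=kbase_users)
```

A user who joins `kbase_users` becomes active immediately, while a request to join the subgroup stays pending until approved; a resource guarded by `authorize(user, curators)` therefore opens only after that approval.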
28. Branded sites
XSEDE
Open Science Grid
University of Chicago
DOE kBase
Indiana University
University of Exeter
NERSC
NIH BIRN
Globus Online
29. Implementation and deployment
[Diagram: an Elastic Load Balancer in front of three Nexus instances, each with a REST API and Web interface, alongside OSSEC, logging, and monitoring]
30. Globus Nexus usage as of 9/13
• >12,000 users and 4977 linked identities
• 557 groups totaling:
– 1638 active members
– 229 pending or invited members
– 162 rejected or suspended members
• Largest group (kbase) has 402 members
[Chart: total users over time, y-axis 0 to 14,000; labeled ticks include Feb-11, Feb-12, and Feb-13]
[Chart: users in group (log scale, 1 to 1000) for groups ranked 1 to 121]
31. Identities and groups in XSEDE
• Proposal: Replace current ad-hoc systems with Globus Nexus identity and group service
– Reduce complexity, reduce cost, increase capability
• Careful process of documentation and review
– “Architecture and development requirements: User and identity management”
– “User management proposal: Affected use cases”
– “User management proposal: Motivating stories”
– “Proposal: Refactoring XSEDE identity and group capabilities”
• Hope to reach closure by end of 2013
32. Cloud services to accelerate discovery
Accelerate discovery and innovation worldwide by providing research IT as a service
Leverage software-as-a-service to
• provide millions of researchers with unprecedented access to powerful tools;
• enable a massive shortening of cycle times in time-consuming research processes; and
• reduce research IT costs dramatically via economies of scale
33. Thanks to ...
U.S. DEPARTMENT OF ENERGY
Foster, Kesselman, and Tuecke claimed that grids were all about “virtual organizations.” The way one should interpret that claim, I would assert, is in the context of Gilder’s comments. Things are distributed, for one reason or another: via a deliberate disintegration process, via outsourcing, or because they just started out distributed. Now we need to reassemble them, in a controlled manner. We gave some examples
173 TB/day
Question: Which steps can we outsource in that way?
Globus Nexus makes it easy for individuals, teams, and institutions to create web applications for the science community. It provides a flexible, powerful Platform-as-a-Service to which developers can outsource their identity, group, and profile management needs. Users encounter intuitive interfaces with a common look and feel across different services.
Four obstacles to collaborative application development
Build collaborative applications
– Outsource identity, group and profile management
– REST API for flexible integration
– Intuitive, customizable interfaces
slide 6: groups should have a use case. KBase is a good example. A few things we do for them:
- All users that log in to the KBase branded site (gologin.kbase.us) will automatically be added to a KBase group. They then create sub-groups under that for various things.
- They use groups for providing access control to various of their resources.
- They use the Nexus OAuth to get tokens that their clients can use to authenticate with the KBase REST APIs.
Can define policies on groups: membership acceptance, invitation, etc. Can set requirements for custom attributes for joining.
Groups can be used for authorization decisions.
We use Groups for Crowd/Confluence, Drupal.
Invitations
Roles
Policies
Different interfaces
Amazon-based infrastructure, high availability/elastic
Distributed architecture (AWS), uses ELBs to allocate workload, stateless Nexus services
Scalable/extensible graph model: we can change the model easily and quickly
Distributed NoSQL databases to store schemaless graph efficiently
Professional hosting, lots of other services like monitoring, logging, security, that are managed across GO.
More specifically, the opportunity is to apply a very modern technology (software as a service, or SaaS) to address a very modern problem, namely the enormous challenges inherent in translating revolutionary 21st century technologies into scientific advances. Our SaaS approach will address these challenges, and both make powerful tools far more widely available and reduce the cycle time associated with research and discovery.
Achieve economies of scale
Reduce cost per researcher dramatically
Achieve positive returns to scale (PRTS)
Most academic solutions do NOT have PRTS
Most industrial solutions DO have PRTS