1. Grids, utility computing and a perspective on the future of IT infrastructure
Washington Area CTO Forum
March 31, 2006
Nirav Kapadia
nhkapadia@gmail.com
2. © Nirav
Kapadia 2
Outline
Characterizing computing grids
Grids as intended versus what we see today
Common types of grids today
Putting computing grids to work
Types of problems addressed by today’s grids
Operational considerations in deploying a grid
A perspective on the future of IT infrastructure
Cost pressures and technology commoditization
Grid and utility computing: the technology enablers
3. Grids came about from a need for large scale, collaborative computing
Scale is measured in terms of users, nodes, organizations, geography, and heterogeneity
A grid in the strict sense of the word involves a large number of heterogeneous, shared resources
Collaboration is measured in terms of resource sharing and interoperability
A key characteristic is the ability to manage across organizational boundaries
4. Systems for large scale, collaborative computing must meet key criteria
Group A
Scalable with users and resources
Support for heterogeneity
Group B
Support for interoperability
Scalable with geographical distances
Group C
Fully distributed (federated) architecture
Ability to compartmentalize along organizational boundaries
[Diagram labels: Strict definition of computing grid; Broad definition of computing grid]
5. Many commercial grid solutions only meet the broad definition of a grid
Cluster management systems
Typically harness clusters of dedicated servers
Examples include Platform LSF, Sun Grid Engine
CPU-scavenging “master-slave” applications
Typically take advantage of idle desktop cycles
Examples include SETI@Home, distributed.net
6. Many commercial grid solutions only meet the broad definition of a grid
Application-specific, custom-built grids
Typically built around a key business function
Examples include Acxiom, Oracle offerings
7. Today, solutions that meet the strict definition of a grid have to be “built”
Grid solutions based on the Globus toolkit
Several vendors have Globus-based offerings
Univa Corp is commercializing Globus
Other grid solutions in academia and research
Most are custom-built and target a specific problem
Typically not appropriate for commercial use (today)
8. Key takeaways
A grid is a distributed computing system that enables large scale, collaborative computing
Scalable across a large number of diverse and geographically dispersed resources
Many commercial “grid solutions” of today do not meet the strict definition of a grid
Limited ability to manage policies and resources across administrative boundaries
9. Outline
Characterizing computing grids
Grids as intended versus what we see today
Common types of grids today
Putting computing grids to work
Types of problems addressed by today’s grids
Operational considerations in deploying a grid
A perspective on the future of IT infrastructure
Cost pressures and technology commoditization
Grid and utility computing: the technology enablers
10. Even today’s grids can benefit users with large scale computing needs
High throughput computing (HTC)
Many independent (non-communicating) tasks
Large problems that break up into manageable, independent tasks
High performance computing (HPC)
Large problem that is not decomposable into manageable, independent tasks
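As a rough illustration of the high throughput case above, the sketch below breaks one large job into independent slices and hands them to a pool of workers. The names `simulate_slice` and `run_htc` and the toy workload are assumptions made for this example; a real grid would dispatch the slices to remote nodes through its scheduler rather than to local threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def simulate_slice(slice_id: int) -> int:
    """Hypothetical independent task: it never talks to the other slices."""
    start = slice_id * 1000
    return sum(i * i for i in range(start, start + 1000))


def run_htc(num_slices: int, workers: int = 4) -> List[int]:
    """Run every slice independently and collect the partial results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_slice, range(num_slices)))


if __name__ == "__main__":
    partials = run_htc(8)
    # The partial results are combined only after all slices have finished.
    print(sum(partials))
```

Because no slice depends on another, the number of workers can change without touching the task code. An HPC problem, by contrast, would need the tasks to exchange data while running.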
11. High throughput computing is common in business environments
Large, legacy applications are best served by cluster management systems
Compute-intensive apps are preferable, but a mix of compute- and data-intensive apps is manageable
Customizable apps that work on small slices of data work well with CPU-scavenging grids
Apps must be compute-intensive and preferably run within a sandbox
12. High performance computing is seen more in targeted environments
Applications involving multiple, communicating tasks typically require custom-designed grid environments
Examples include the Oracle grid offering and some test beds built with Globus
Other examples include distributed computing platforms such as PVM and MPI
13. So… you’re ready to deploy a grid computing environment…
As with any other technology, there are several operational considerations…
Resources on the grid – dedicated or shared?
Access management – who needs access to what?
Data management – how does data get to the grid?
Security model employed by the grid
14. Resources on the grid – should they be dedicated or shared?
Cluster Mgmt Systems
Cluster management systems work best with dedicated resources
Condor – from the U of Wisconsin – is a notable exception, but not commercially available
CPU-scavenging grids
As the name implies, resources are shared – and typically involve desktops
A custom screen saver is the most common vehicle for running the grid application
15. Access management – who needs (gets) access to what?
Cluster Mgmt Systems
Option #1: jobs run in a guest account
Shared access across jobs
Option #2: accounts for everyone on all machines
Homogeneous uid pool highly recommended
Logins typically disabled
CPU-scavenging grids
Option #1: jobs run with user’s privileges
If downloaded by user
Option #2: jobs run in guest account
If set up by administrator
No direct remote user access to desktop
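The recommendation of a homogeneous uid pool means each user should map to the same uid on every machine in the cluster. A minimal sketch of checking that, assuming each host's account table is already available as a mapping from user name to uid (the function name, host names, and accounts below are hypothetical):

```python
from typing import Dict, Set


def find_uid_conflicts(hosts: Dict[str, Dict[str, int]]) -> Dict[str, Set[int]]:
    """Return the users whose uid is not the same on every host."""
    seen: Dict[str, Set[int]] = {}
    for accounts in hosts.values():
        for user, uid in accounts.items():
            seen.setdefault(user, set()).add(uid)
    # A user with more than one distinct uid breaks the homogeneous pool.
    return {user: uids for user, uids in seen.items() if len(uids) > 1}


if __name__ == "__main__":
    # Hypothetical account tables for two cluster nodes.
    cluster = {
        "node01": {"alice": 1001, "bob": 1002},
        "node02": {"alice": 1001, "bob": 2002},
    }
    for user, uids in find_uid_conflicts(cluster).items():
        print(user, sorted(uids))
```

A mismatch matters because files written by a job on one node would appear to belong to a different account on another node.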
16. Data management – how does data get to the apps?
Cluster Mgmt Systems
Transfer user-specified files via ftp, scp, etc.
File staging for large data
On-demand file transfer (system call traps)
Shared file systems
CPU-scavenging grids
Data embedded within application or retrieved via HTTP/Java callbacks
Limited data, typically no files
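File staging, one of the cluster options above, can be sketched as: copy the inputs to a scratch area, run the job there, then copy the results back. The sketch below uses local file copies as a stand-in for ftp or scp; `run_with_staging` and its parameters are hypothetical names chosen for this example, not part of any grid product.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_with_staging(inputs: Iterable[Path],
                     job: Callable[[Path], None],
                     output_dir: Path) -> None:
    """Stage inputs into a scratch directory, run the job, copy new files back."""
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch)
        staged = set()
        for src in inputs:
            shutil.copy(src, work / src.name)  # stage-in (stand-in for scp/ftp)
            staged.add(src.name)
        job(work)  # the job only sees the staged files
        output_dir.mkdir(parents=True, exist_ok=True)
        for produced in work.iterdir():
            # Stage-out: copy back only the files the job created.
            if produced.is_file() and produced.name not in staged:
                shutil.copy(produced, output_dir / produced.name)
```

The scratch directory is removed once the job finishes, so the compute node keeps no user data after the run.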
17. Security model – user accountability is key today
[Diagram; labels recovered from the slide]
Basic system and kernel safeguards
Run Time Environment; Application Executable; Application Generation; Application Users
Unchanged Binaries; Object Code Modifications; Source Code Modifications; Custom Applications
Ideal Grid; Unix; LSF, PBS, SGE; Globus; Condor; Java, PCCs; distributed.net, SETI@Home, etc.
Access management (capability control)
Opportunities for subversion
18. Key takeaways
Today’s commercially available grid solutions primarily target high throughput computing
Cluster management systems and CPU-scavenging grids are the most common
Carefully consider the policy implications of grids in terms of access and data management
More of a concern for grids that span subnets or firewalls
19. Outline
Characterizing computing grids
Grids as intended versus what we see today
Common types of grids today
Putting computing grids to work
Types of problems addressed by today’s grids
Operational considerations in deploying a grid
A perspective on the future of IT infrastructure
Cost pressures and technology commoditization
Grid and utility computing: the technology enablers
20. Even as grids take hold, the IT landscape is changing rapidly…
Technology is rapidly being commoditized
Businesses are more willing and able to shop for IT services
In-house IT infrastructure is increasingly seen as complex and rigid
© Harvard Business Review
21. IT infrastructure is already a commodity from a business view
Outsourcing is pervasive, and standards-based, open systems are increasingly common
Cost pressures will continue driving businesses to streamline IT infrastructure
More often than not, customized in-house IT systems stand out for their cost and complexity
Common off-the-shelf solutions provide more value in the absence of direct competitive advantage
22. In time, economics will drive IT infrastructure out of the enterprise
The technology enablers for this paradigm exist today, but are still nascent
(True) grids offer a way to manage computing resources across organizational boundaries
Utility computing solutions bring together grids, data center automation, and virtualization
23. The technology implications of these changes are enormous
Computing infrastructure needs to become transparent to end users
Users only interact with applications and data
Policy management needs to be decoupled from system management
Cannot assume users can be held accountable
Components of computing systems need to be less tightly coupled
CPU, OS, data, apps may all be in different, remote locations
24. A utility computing test bed at Purdue showcases this paradigm
Operating since 1995; now a joint development effort between Purdue and U of Florida
By 2001, allowed 3,000+ users from 30 countries to run ~100 applications in a utility environment
Extensively validated: ~400,000 runs (by 2001); highly peaked usage profile
Powers online simulations in the nanoHUB.org portal for the nanotechnology community
25. nanoHUB.org – remote access to simulators and compute power
[Diagram; labels recovered from the slide]
Cluster; TeraGrid; Condor-G; Globus; Internet; nanoHUB infrastructure; nanoHUB.org Web site; Physical Machine; Virtual Machine; NMI Cluster; Remote desktop (VNC)
Real users and real usage: >10,687 users
Slide courtesy of Gerhard Klimeck, Network for Computational Nanotechnology
27. In conclusion…
Today’s commercially available grids provide a valuable but narrow service
More efficient computing in a closed environment; limited support for cross-organizational sharing
In time, grid and utility computing technologies will move IT infrastructure out of the enterprise
Virtualization and data center automation products are visible precursors