1. GRID AND CLOUD COMPUTING
Introduction to Grid and Cloud Computing
Courtesy: Dr Gnanasekaran Thangavel
https://web.uettaxila.edu.pk/CMS/2022/SPR2022/teGNCCms/index.asp
2. UNIT I INTRODUCTION
Evolution of Distributed computing:
– Scalable computing over the Internet
– Technologies for network-based systems
– Clusters of cooperative computers
– Grid computing Infrastructures
– Cloud computing
2
3/28/2024
5. Reference Book
Authors: Judith Hurwitz, Robin Bloor, Marcia Kaufman, Fern Halper
Publisher: John Wiley & Sons, 2010
339 pages
6. Distributed Computing
Definition:
“A distributed system consists of multiple autonomous
computers that communicate through a computer network.”
“Distributed computing utilizes a network of many computers,
each accomplishing a portion of an overall task, to achieve a
computational result much more quickly than with a single
computer.”
“Distributed computing is a type of computing that involves multiple
computers, remote from each other, each having a role in a
computation problem or information processing.”
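The idea of "each accomplishing a portion of an overall task" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the worker threads stand in for separate networked computers, and the names `partial_sum` and `distributed_sum` are hypothetical, not taken from any grid or cloud toolkit.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Work done by one 'computer': sum its share of the numbers."""
    return sum(chunk)

def distributed_sum(numbers, workers=4):
    """Split the task into portions, hand them out, and combine the results."""
    size = max(1, -(-len(numbers) // workers))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    # Threads are an assumption of this sketch; real nodes would be remote.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

total = distributed_sum(list(range(1, 101)))
print(total)  # 5050
```

On a real distributed system the chunks would travel over the network, so the speed-up depends on how the communication cost compares with the computation saved.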
7. Introduction
• A distributed system is one in which hardware or
software components located at networked
computers communicate and coordinate their actions
only by message passing.
• In the term distributed computing, the word distributed
means spread out across space. Thus, distributed
computing is an activity performed on a spatially
distributed system.
• These networked computers may be in the same room, on the
same campus, in the same country, or on different continents.
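Coordination "only by message passing" can be illustrated with a minimal Python sketch. It assumes a single worker thread standing in for a remote computer and two in-memory queues standing in for the network; the `worker` function and the message format are hypothetical.

```python
import queue
import threading

def worker(inbox, outbox):
    """A node that shares no state: it only receives and sends messages."""
    while True:
        msg = inbox.get()
        if msg is None:              # sentinel message: shut down
            break
        outbox.put(("result", msg * msg))

inbox, outbox = queue.Queue(), queue.Queue()
node = threading.Thread(target=worker, args=(inbox, outbox))
node.start()

for n in (2, 3, 4):
    inbox.put(n)                     # send three requests
inbox.put(None)                      # ask the node to stop
node.join()

replies = [outbox.get() for _ in range(3)]
print(replies)  # [('result', 4), ('result', 9), ('result', 16)]
```

Neither side reads the other's variables; everything the two sides know about each other arrives as a message, which is the property the definition above relies on.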
9. Motivation
• Inherently distributed applications
• Performance/cost
• Resource sharing
• Flexibility and extensibility
• Availability and fault tolerance
• Scalability
• Network connectivity is increasing.
• A combination of cheap processors is often more cost-effective than one
expensive fast system.
• Potential increase in reliability.
10. History
• 1975 – 1985
– Parallel computing was favored in the early years
– Primarily vector-based at first
– Gradually more thread-based parallelism was introduced
– The first distributed computing programs were a pair of programs called
Creeper and Reaper, invented in the 1970s
– Ethernet was invented in the 1970s
– ARPANET e-mail was invented in the early 1970s and is probably the
earliest example of a large-scale distributed application
11. History
• 1985 – 1995
– Massively parallel architectures started rising, and the Message Passing Interface (MPI) and
other libraries were developed
– Bandwidth was a big problem
– The first Internet-based distributed computing project was started in 1988 by the
DEC Systems Research Center.
– Distributed.net, a project founded in 1997, is considered the first to use the
Internet to distribute data for calculation and collect the results.
12. History
• 1995 – Today
– Cluster/grid architecture increasingly dominant
– Special node machines were avoided in favor of COTS technologies
– Web-wide cluster software
– Google takes this to the extreme (thousands of nodes per cluster)
– SETI@Home started in May 1999 to analyze the radio signals being
collected by the Arecibo Radio Telescope in Puerto Rico.
13. Goal
• Making Resources Accessible
– Data sharing and device sharing
• Distribution Transparency
– Access, location, migration, relocation, replication, concurrency, failure
• Communication
– Make human-to-human communication easier, e.g., electronic mail
• Flexibility
– Spread the workload over the available machines in the most
cost-effective way
• To coordinate the use of shared resources
• To solve large computational problems
15. Distributed Computing Architecture
• Client-server
• 3-tier architecture
• N-tier architecture
• Loose coupling or tight coupling
• Peer-to-peer
• Space-based
16. Application of Distributed Systems
• Examples of commercial applications:
– Database Management System
– Distributed computing using mobile agents
– Local intranet
– Internet (World Wide Web)
– Java Remote Method Invocation (RMI)
17. Distributed Computing Using Mobile Agents
• Mobile agents can be wandering around in a network using free
resources for their own computations.
18. Local Intranet
• A portion of the Internet that is separately administered and supports internal sharing of
resources (file/storage systems and printers) using Internet protocols is called a local
intranet.
19. Internet
• The Internet is a global system of interconnected computer networks that use the standardized Internet
Protocol Suite (TCP/IP).
20. JAVA RMI
• Embedded in the Java language:-
– Object variant of remote procedure call
– Adds naming compared with RPC (Remote Procedure Call)
– Restricted to Java environments
Figure: RMI Architecture
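The remote-call pattern that RMI builds on can be sketched with Python's standard `xmlrpc` modules. This is not Java RMI; it is a stand-in chosen to keep the example short, and it assumes a server running in a background thread on the local machine. The function name `add` and the loopback address are illustrative assumptions.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    """The procedure that will be invoked remotely."""
    return a + b

# Server side: register the procedure under a name (port 0 = any free port).
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy makes the remote call look like a local one.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.add(2, 3)
print(result)  # 5

server.shutdown()
server.server_close()
```

The client looks the procedure up by name and calls it through a proxy object, which is broadly the role the stub and the registry play in RMI; unlike RMI, this sketch passes plain values rather than Java objects.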
21. Advantages
• Economics:-
– Computers harnessed together give a better price/performance ratio than
mainframes.
• Speed:-
– A distributed system may have more total computing power than a mainframe.
• Inherent distribution of applications:-
– Some applications are inherently distributed. e.g., an ATM-banking application.
• Reliability:-
– If one machine crashes, the system as a whole can still survive if you have
multiple server machines and multiple storage devices (redundancy).
• Extensibility and Incremental Growth:-
– Possible to gradually scale up (in terms of processing power and functionality)
by adding more resources (both hardware and software).
This can be done without disruption to the rest of the system.
22. Disadvantages
• Complexity :-
– Lack of experience in designing and implementing a distributed system. e.g.,
which platform (hardware and OS) to use, which language to use etc.
• Network problem:-
– If the network underlying a distributed system saturates or goes down, then
the distributed system will be effectively disabled thus negating most of the
advantages of the distributed system.
• Security:-
– Security is a major hazard since easy access to data means easy access to
secret data as well.
23. Issues and Challenges
• Heterogeneity of components :-
– Variety or differences that apply to computer hardware, network,
OS, programming language and implementations by different
developers.
– All differences in representation must be dealt with to do message
exchange.
– Example: the calls for exchanging messages in UNIX differ
from those in Windows.
• Openness:-
– System can be extended and re-implemented in various ways.
– Cannot be achieved unless the specification and documentation
are made available to software developers.
– The main challenge for designers is to tackle the complexity of
distributed systems designed by different people.
24. Issues and Challenges cont…
• Transparency:-
– Aim: make certain aspects of distribution invisible to
application programmers so that they can focus on the design of their particular
application.
– They need not be concerned with where a resource is located or the details of how it
operates, whether it is replicated or migrated.
– Failures can be presented to application programmers in the form
of exceptions that must be handled.
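How a failure "presented as an exception" might be handled can be sketched as follows. The exception class `RemoteCallError`, the helper `call_with_retry`, and the deliberately flaky `flaky_lookup` are all hypothetical names invented for this illustration; retrying is only one possible handling policy.

```python
class RemoteCallError(Exception):
    """Hypothetical exception raised when a remote operation fails."""

def call_with_retry(operation, attempts=3):
    """Run a remote operation, retrying when it reports a failure."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except RemoteCallError as err:
            last_error = err          # remember the failure and try again
    raise RemoteCallError(f"gave up after {attempts} attempts") from last_error

calls = {"count": 0}

def flaky_lookup():
    """Stand-in for a remote call that fails twice and then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RemoteCallError("server unreachable")
    return "record-42"

answer = call_with_retry(flaky_lookup)
print(answer)  # record-42
```

The application code sees either a result or an exception; it never needs to know which machine failed or why, which is the transparency goal described above.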
25. Issues and Challenges cont…
• Transparency:-
– This concept can be summarized as shown in this Figure:
26. Issues and Challenges cont…
• Security:-
– Security for information resources in a distributed system has three
components:
a. Confidentiality : protection against disclosure to
unauthorized individuals.
b. Integrity : protection against alteration/corruption
c. Availability : protection against interference with the means
to access the resources.
– The challenge is to send sensitive information over Internet in a
secure manner and to identify a remote user or other agent
correctly.
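The integrity component, together with a simple way of checking that a message comes from a party holding a shared secret, can be sketched with Python's standard `hmac` module. The key, the message contents, and the helper names `sign` and `verify` are assumptions made for this example; confidentiality would additionally require encryption, which is not shown.

```python
import hashlib
import hmac

# Assumption: sender and receiver already share this secret key.
SHARED_KEY = b"demo-key-not-for-production"

def sign(message: bytes) -> bytes:
    """Compute a keyed digest (tag) that travels with the message."""
    return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(message), tag)

msg = b"transfer 100 to account 7"
tag = sign(msg)

print(verify(msg, tag))                           # True
print(verify(b"transfer 900 to account 7", tag))  # False
```

Any alteration of the message in transit changes the expected tag, so the receiver can detect the corruption, and only a holder of the key can produce a valid tag in the first place.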
27. Issues and Challenges cont..
• Scalability :-
– Distributed computing operates at many different scales, ranging
from small Intranet to Internet.
– A system is scalable if it remains effective when there is a significant increase in the number
of resources and users.
– The challenges are:
a. controlling the cost of physical resources.
b. controlling the performance loss.
c. preventing software resources from running out.
d. avoiding performance bottlenecks.
28. Issues and Challenges cont…
• Failure Handling :-
– Failures in a distributed system are partial – some
components fail while others can function.
– That is why handling failures is difficult:
a. Detecting failures: some failures cannot be detected but may
be suspected; the challenge is to manage in their presence.
b. Masking failures: hiding a failure is not guaranteed in the worst
case.
• Concurrency :-
– Where applications/services process requests concurrently, their operations may
conflict with one another and produce
inconsistent results.
– Each resource must be designed to be safe in a concurrent
environment.
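A resource that is "safe in a concurrent environment" can be sketched with a lock around its update operation. The `Counter` class and the `hammer` helper are hypothetical names for this illustration, and threads in one process stand in for concurrent clients of a shared service.

```python
import threading

class Counter:
    """Hypothetical shared resource made safe for concurrent use."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write step atomic
        # with respect to other threads.
        with self._lock:
            self.value += 1

def hammer(counter, times):
    """One client issuing many updates."""
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 1000))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 8000
```

Without the lock, two clients could read the same old value and both write back the same new one, losing an update; that is the kind of inconsistent result the slide warns about.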
29. Conclusion
• Distributed computing is an efficient way to
optimize the use of computing resources.
• Distributed computing is everywhere: intranets, the Internet, and mobile
ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi
systems).
• It deals with hardware and software systems that contain more
than one processing or storage element and run concurrently.
• Main motivation factor is resource sharing; such as files , printers,
web pages or database records.
• Grid computing and Cloud computing are forms of distributed
computing.
30. Grid Computing
Grid computing is a form of distributed computing whereby a
"super and virtual computer" is composed of a cluster of
networked, loosely coupled computers, acting in concert to
perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing
technology that facilitates the execution of large-scale,
resource-intensive applications on geographically distributed computing
resources.
It facilitates flexible, secure, coordinated large-scale resource
sharing among dynamic collections of individuals, institutions,
and resources.
It enables communities (“virtual organizations”) to share
geographically distributed resources as they pursue common
goals.
31. Criteria for a Grid:
• Coordinates resources that are not subject to centralized control
• Uses standard, open, general-purpose protocols and interfaces.
• Delivers nontrivial qualities of service
Benefits
• Exploit Underutilized resources
• Resource load Balancing
• Virtualize resources across an enterprise
• Data Grids, Compute Grids
• Enable collaboration for virtual organizations
32. Grid Applications
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific,
mathematical, and academic problems such as drug discovery, economic
forecasting, and seismic analysis, and to back-office data processing in support of
e-commerce.
• A chemist may utilize hundreds of processors to screen thousands of
compounds per hour.
• Teams of engineers worldwide pool resources to analyze terabytes of structural
data.
• Meteorologists seek to visualize and analyze petabytes of climate data with
enormous computational demands.
Resource sharing
– Computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, …
Coordinated problem solving
– distributed data analysis, computation, collaboration, …
33. Grid Topologies
• Intragrid
– Local grid within an organization
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organizations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
34. Computational Grid
“A computational grid is a hardware and software infrastructure that
provides dependable, consistent, pervasive, and inexpensive access to
high-end computational capabilities.”
“The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster
Example: Science Grid (US Department of Energy)
35. Data Grid
• A data grid is a grid computing system that deals with data — the
controlled sharing and management of large amounts of distributed
data.
• Data Grid is the storage component of a grid environment. Scientific
and engineering applications require access to large amounts of data,
and often this data is widely distributed. A data grid provides seamless
access to the local or remote data required to complete compute
intensive calculations.
Examples:
Biomedical Informatics Research Network (BIRN),
the Southern California Earthquake Center (SCEC).
37. Distributed Supercomputing
• Combining multiple high-capacity resources on a
computational grid into a single, virtual distributed
supercomputer.
• Tackle problems that cannot be solved on a single system.
38. High-Throughput Computing
• Uses the grid to schedule large numbers of loosely coupled or
independent tasks, with the goal of putting unused processor
cycles to work.
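The high-throughput pattern, many independent tasks handed to whatever worker is free, can be sketched as a bag of tasks. This is an illustration only: a local thread pool stands in for idle grid processors, and `simulate` is a hypothetical job, not part of Condor or any other scheduler.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def simulate(task_id):
    """One independent job; a stand-in for a real simulation run."""
    return task_id, sum(i * i for i in range(task_id * 1000))

results = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit all tasks up front; no task depends on any other.
    futures = [pool.submit(simulate, t) for t in range(1, 21)]
    # Collect results in whatever order the workers finish them.
    for fut in as_completed(futures):
        task_id, value = fut.result()
        results[task_id] = value

print(len(results))  # 20
```

Because the tasks never communicate with each other, the scheduler is free to place each one wherever spare cycles exist, and total throughput matters more than the latency of any single task.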
On-Demand Computing
• Uses grid capabilities to meet short-term requirements for
resources that are not locally accessible.
• Models real-time computing demands.
39. Collaborative Computing
• Concerned primarily with enabling and enhancing human-to-human
interactions.
• Applications are often structured in terms of a virtual shared space.
Data-Intensive Computing
• The focus is on synthesizing new information from data that
is maintained in geographically distributed repositories,
digital libraries, and databases.
• Particularly useful for distributed data mining.
40. Logistical Networking
• Logistical networks focus on exposing storage resources
inside networks by optimizing the global scheduling of data
transport, and data storage.
• Contrasts with traditional networking, which does not
explicitly model storage resources in the network.
• Provides high-level services for Grid applications.
• Called "logistical" because of the analogy it bears with the
systems of warehouses, depots, and distribution channels.
41. P2P Computing vs Grid Computing
• Differ in target communities
• A Grid system deals with a more complex, more powerful, more
diverse, and more highly interconnected set of resources than
P2P
• P2P uses heterogeneous end-user devices for resource
sharing to fulfill the application requirements.
• Business logic and data are distributed among end-user
nodes for P2P applications.
42. A typical view of Grid environment
Components shown in the figure: User, Resource Broker, Grid Resources, Grid Information Service
Step 1: The Grid Information Service collects the details of the available Grid
resources and passes the information to the resource broker.
Step 2: A user sends a computation- or data-intensive application to Global Grids in
order to speed up the execution of the application.
Step 3: A resource broker distributes the jobs in an application to the Grid resources
based on the user’s QoS requirements and the details of available Grid
resources for further execution.
Step 4: Grid resources (cluster, PC, supercomputer, database, instruments, etc.) in
the Global Grid execute the user jobs.
Flows labeled in the figure: Grid application, Computational jobs, Details of Grid resources, Processed jobs, Computation result
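The interaction between the information service, the resource broker, and the resources can be sketched as a toy model. Everything here is an assumption made for illustration: the class names, the use of a maximum hourly cost as the only QoS requirement, and the "cheapest resource that fits" policy are not taken from any real grid middleware.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    free_cpus: int
    cost_per_hour: float

class InformationService:
    """Step 1: keeps details of the available resources."""
    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def available(self):
        return list(self._resources)

class ResourceBroker:
    """Step 3: maps jobs to resources that satisfy the user's QoS limit."""
    def __init__(self, info):
        self.info = info

    def schedule(self, jobs, max_cost):
        candidates = sorted(
            (r for r in self.info.available() if r.cost_per_hour <= max_cost),
            key=lambda r: r.cost_per_hour,
        )
        plan = {}
        for job, cpus in jobs.items():
            plan[job] = None                  # None means no resource fits
            for r in candidates:
                if r.free_cpus >= cpus:
                    r.free_cpus -= cpus       # reserve capacity for this job
                    plan[job] = r.name
                    break
        return plan

info = InformationService()
info.register(Resource("cluster-a", free_cpus=8, cost_per_hour=2.0))
info.register(Resource("pc-b", free_cpus=2, cost_per_hour=0.5))
info.register(Resource("super-c", free_cpus=64, cost_per_hour=10.0))

# Step 2: the user submits jobs (name -> CPUs needed) with a cost limit.
plan = ResourceBroker(info).schedule(
    {"render": 2, "analyze": 4, "simulate": 32}, max_cost=5.0)
print(plan)  # {'render': 'pc-b', 'analyze': 'cluster-a', 'simulate': None}
```

The "simulate" job stays unscheduled because the only resource large enough exceeds the cost limit; a real broker would typically queue the job or renegotiate rather than give up.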
43. Grid Middleware
• Grids are typically managed by gridware, a special type of middleware that enables sharing and
manages grid components based on user requirements and resource attributes (e.g., capacity,
performance).
• Software that connects other software components or applications to provide the following functions:
– Run applications on suitable available resources (brokering, scheduling)
– Provide uniform, high-level access to resources
– Address inter-domain issues of security, policy, etc. (federated identities)
– Provide application-level status monitoring and control
44. Middleware
• Globus – University of Chicago
• Condor – University of Wisconsin – High-throughput computing
• Legion – University of Virginia – Virtual workspaces – Collaborative
computing
• IBP – Internet Backplane Protocol – University of Tennessee – Logistical networking
• NetSolve – Solving scientific problems in heterogeneous environments –
high-throughput and data-intensive
45. Two Key Grid Computing Groups
The Globus Alliance (www.globus.org)
• Composed of people from:
Argonne National Labs, University of Chicago, University of Southern California Information
Sciences Institute, University of Edinburgh and others.
• OGSA/I standards initially proposed by the Globus Group
The Global Grid Forum (www.ggf.org)
• Heavy involvement of Academic Groups and Industry
– (e.g. IBM Grid Computing, HP, United Devices, Oracle, UK e-Science Programme, US DOE,
US NSF, Indiana University, and many others)
• Process
– Meets three times annually
– Solicits involvement from industry, research groups, and academics
46. Some of the Major Grid Projects
Name | URL / Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technology for remote access to supercomputer resources and simulation codes; in GRIP, integrate with Globus Toolkit™
Fusion Collaboratory | fusiongrid.org; DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project™ | globus.org; DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create and apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, and support of the NSF Middleware Infrastructure for research and education
47. Cloud Computing
• Cloud computing refers to applications and services that run on a
distributed network using virtualized resources and accessed by
common Internet protocols and networking standards.
• It is distinguished by the notion that resources are virtual and
limitless and that details of the physical systems on which software
runs are abstracted from the user.
48. Cloud Computing
• Cloud computing takes the technology, services, and applications that are similar
to those on the Internet and turns them into a self-service utility. The use of the
word “cloud” makes reference to the two essential concepts:
– Abstraction: Cloud computing abstracts the details of system implementation
from users and developers. Applications run on physical systems that aren't
specified, data is stored in locations that are unknown, administration of
systems is outsourced to others, and access by users is ubiquitous.
– Virtualization: Cloud computing virtualizes systems by pooling and sharing
resources. Systems and storage can be provisioned as needed from a
centralized infrastructure, costs are assessed on a metered basis,
multi-tenancy is enabled, and resources are scalable with agility.
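Pooling, provisioning on demand, multi-tenancy, and metered cost can be tied together in a small sketch. The `ResourcePool` class, the tenant names, and the flat rate per VM-hour are hypothetical; real cloud providers meter and price resources in far more detailed ways.

```python
from collections import defaultdict

class ResourcePool:
    """Hypothetical shared pool of virtual machines with metered billing."""

    def __init__(self, total_vms, rate_per_vm_hour):
        self.free_vms = total_vms
        self.rate = rate_per_vm_hour
        self.in_use = defaultdict(int)       # tenant -> VMs currently held
        self.vm_hours = defaultdict(float)   # tenant -> metered usage

    def provision(self, tenant, vms):
        """Hand out VMs from the shared pool as they are needed."""
        if vms > self.free_vms:
            raise RuntimeError("pool exhausted")
        self.free_vms -= vms
        self.in_use[tenant] += vms

    def release(self, tenant, vms, hours):
        """Return VMs to the pool and record how long they were used."""
        if vms > self.in_use[tenant]:
            raise ValueError("tenant does not hold that many VMs")
        self.in_use[tenant] -= vms
        self.free_vms += vms
        self.vm_hours[tenant] += vms * hours

    def bill(self, tenant):
        """Cost is assessed only on what was actually used."""
        return self.vm_hours[tenant] * self.rate

pool = ResourcePool(total_vms=10, rate_per_vm_hour=0.10)
pool.provision("alice", 4)           # two tenants share one pool
pool.provision("bob", 6)
pool.release("alice", 4, hours=3)    # 4 VMs x 3 hours = 12 VM-hours

print(round(pool.bill("alice"), 2))  # 1.2
```

Neither tenant knows which physical machines back its VMs, and each pays only for the VM-hours it consumed, which is the combination of abstraction and metering described above.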
49. Cloud Computing
• Cloud computing is an abstraction based on the notion of pooling physical resources
and presenting them as a virtual resource. It is a new model for provisioning resources,
for staging applications, and for platform-independent user access to services.
• To help clarify how cloud computing has changed the nature of commercial system
deployment, consider these three examples:
– Google: In the last decade, Google has built a worldwide network of datacenters to service its
search engine. In doing so Google has captured a substantial portion of the world's advertising
revenue. That revenue has enabled Google to offer free software to users based on that
infrastructure and has changed the market for user-facing software. This is the classic Software
as a Service case.
– Azure Platform: By contrast, Microsoft is creating the Azure Platform. It enables .NET
Framework applications to run over the Internet as an alternate platform for Microsoft developer
software running on desktops.
– Amazon Web Services: One of the most successful cloud-based businesses is Amazon Web
Services, which is an Infrastructure as a Service offering that lets you rent virtual computers on
Amazon's own infrastructure.