Distributed Computing:
 In the term distributed computing, the word distributed means spread out across space.
Thus, distributed computing is an activity performed on a distributed system.
 These networked computers may be in the same room, on the same campus, in the same country, or in different countries.
History:
 The use of concurrent processes that communicate by message-passing has its roots
in operating system architectures studied in the 1960s.
 The study of distributed computing became its own branch of computer science in the
late 1970s and early 1980s.
 The first conference in the field, Symposium on Principles of Distributed
Computing (PODC), dates back to 1982, and its European counterpart International
Symposium on Distributed Computing (DISC) was first held in 1985.
Introduction:
In a distributed system, each processor has its own memory. The computational entities are
called computers or nodes.
In distributed computing a program is split up into parts that run simultaneously on multiple
computers communicating over a network.
Distributed computing is a form of parallel computing.
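As a toy sketch of this idea, the program below splits a summation job between a "local" node and a "worker" node that communicate over a network socket. Both nodes run on localhost in separate threads here; in a real distributed system each would run on a different machine, and the port and message format are illustrative choices.

```python
import socket
import threading

# "Remote" node: listens for a message of comma-separated numbers and
# replies with their sum. In a real system this runs on another machine.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
port = srv.getsockname()[1]
srv.listen(1)

def worker():
    conn, _ = srv.accept()
    data = conn.recv(1024).decode()               # e.g. "1,2,3,4"
    conn.sendall(str(sum(int(x) for x in data.split(","))).encode())
    conn.close()

t = threading.Thread(target=worker)
t.start()

# "Local" node: ships its part of the problem over the network and waits.
sock = socket.socket()
sock.connect(("127.0.0.1", port))
sock.sendall(b"1,2,3,4")
partial_sum = int(sock.recv(1024).decode())
sock.close()
t.join()
srv.close()
print(partial_sum)  # 10
```

The essential point is that the two parts share no memory: the only way data moves between them is by explicit messages over the network.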
Working Of Distributed System:
Fig. A Distributed System
Types Of Distributed Computing:
 Grid computing
Formally: multiple independent computing clusters which act like a “grid” because they are
composed of resource nodes not located within a single administrative domain.
Informally: the creation of a “virtual supercomputer” by using spare computing resources within an
organization.
 Cloud computing
Cloud computing is a computing paradigm shift where computing is moved away from
personal computers or an individual application server to a “cloud” of computers. Users of the
cloud only need to be concerned with the computing service being asked for, as the underlying
details of how it is achieved are hidden. This method of distributed computing works by
pooling all computing resources together and managing them with software rather than manually.
Motivation:
The main motivations in moving to a distributed system are the following:
 Inherently distributed applications.
 Performance/cost.
 Resource sharing.
 Flexibility and extensibility.
 Availability and fault tolerance.
 Scalability.
Goals:
 Making Resources Accessible
The main goal of a distributed system is to make it easy for the users (and applications) to
access remote resources, and to share them in a controlled and efficient way.
 Distribution Transparency
An important goal of a distributed system is to hide the fact that its processes and
resources are physically distributed across multiple computers.
 Openness
An open distributed system is a system that offers services according to standard rules
that describe the syntax and semantics of those services.
 Scalability
Scalability of a system can be measured along at least three dimensions: its size, its
geographical spread, and the number of administrative organizations it spans.
Characteristics:
 Resource Sharing:- Resource sharing is the ability to use any hardware, software or data
anywhere in the system.
 Openness:- Openness is concerned with extensions and improvements of distributed
systems.
 Concurrency:- Concurrency arises naturally in distributed systems from the separate
activities of users, the independence of resources and the location of server processes in
separate computers.
 Transparency:- Transparency hides the complexity of the distributed system from users and
application programmers.
Architecture:
Examples of distributed systems:
Examples of distributed systems and applications of distributed computing include the following:
 Telecommunication networks:
 Telephone networks and cellular networks
 Computer networks such as the Internet
 Network applications:
 World Wide Web and peer-to-peer networks
 Massively multiplayer online games and virtual reality communities
 Real-time process control:
 Aircraft control systems
 Industrial control systems
 Parallel computation:
 Scientific computing, including cluster computing and grid computing and
various volunteer computing projects
 Distributed rendering in computer graphics
Advantages:
 Economics
 Speed
 Inherent distribution of applications
 Reliability
Disadvantages:
 Complexity
 Network problems
 Security
Parallel computing:
Serial computing:
 Traditionally, software has been written for serial computation:
– To be run on a single computer having a single Central Processing Unit (CPU);
– A problem is broken into a discrete series of instructions.
– Instructions are executed one after another.
– Only one instruction may execute at any moment in time.
Parallel computing:
 In the simplest sense, parallel computing is the simultaneous use of multiple compute
resources to solve a computational problem.
– To be run using multiple CPUs
– A problem is broken into discrete parts that can be solved concurrently
– Each part is further broken down to a series of instructions
 Instructions from each part execute simultaneously on different CPUs
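The contrast between the two models can be sketched in Python for a toy sum-of-squares problem (the function names and the choice of four parts are illustrative): the serial version is one instruction stream, while the parallel version breaks the range into discrete parts that a pool of worker processes solves concurrently.

```python
from concurrent.futures import ProcessPoolExecutor

def sum_squares(lo, hi):
    return sum(i * i for i in range(lo, hi))

def serial(n):
    # one instruction stream, executed one step after another
    return sum_squares(0, n)

def parallel(n, parts=4):
    # break the problem into discrete parts that can be solved concurrently
    bounds = [(k * n // parts, (k + 1) * n // parts) for k in range(parts)]
    with ProcessPoolExecutor(max_workers=parts) as pool:
        # each part's instructions run simultaneously on a different CPU
        partials = pool.map(sum_squares, *zip(*bounds))
    return sum(partials)

if __name__ == "__main__":
    print(serial(1000) == parallel(1000))  # True
```

Both versions compute the same answer; the parallel one simply overlaps the work in time, at the cost of the splitting and recombining steps.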
Parallel Computing: Resources
 The compute resources can include:
– A single computer with multiple processors;
– A single computer with (multiple) processor(s) and some specialized computer
resources (GPU, FPGA …)
– An arbitrary number of computers connected by a network;
– A combination of the above.
Parallel Computing: What For?
 Parallel computing is an evolution of serial computing that attempts to emulate what has
always been the state of affairs in the natural world: many complex, interrelated events
happening at the same time, yet within a sequence.
 Some examples:
– Planetary and galactic orbits
– Weather and ocean patterns
– Tectonic plate drift
– Rush hour traffic in Paris
– Automobile assembly line
Flynn's Classical Taxonomy:
• There are different ways to classify parallel computers. One of the more widely used
classifications, in use since 1966, is called Flynn's Taxonomy.
• Flynn's taxonomy distinguishes multi-processor computer architectures according to how
they can be classified along the two independent dimensions of Instruction and Data.
Each of these dimensions can have only one of two possible states: Single or Multiple.
• The matrix below defines the 4 possible classifications according to Flynn.
Single Instruction, Single Data (SISD):
• A serial (non-parallel) computer
• Single instruction: only one instruction stream is being acted on by the CPU during any
one clock cycle
• Single data: only one data stream is being used as input during any one clock cycle
• Deterministic execution
• This is the oldest and, until recently, the most prevalent form of computer
• Eg: most PCs, single-CPU workstations and mainframes
Single Instruction, Multiple Data (SIMD):
• A type of parallel computer
• Single instruction: All processing units execute the same instruction at any given clock
cycle
• Multiple data: Each processing unit can operate on a different data element
• This type of machine typically has an instruction dispatcher, a very high-bandwidth
internal network, and a very large array of very small-capacity execution units.
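The SIMD pattern can be illustrated with NumPy (an analogy, not actual SIMD hardware): a single vectorized operation ("one instruction") is applied across every element of an array ("multiple data") in one call, and on most modern CPUs NumPy does dispatch such operations to real SIMD instructions under the hood.

```python
import numpy as np

# one data vector, four elements
data = np.array([1.0, 2.0, 3.0, 4.0])

# a single "instruction" (multiply-add) applied to all elements at once,
# rather than a SISD-style loop touching one element per step
result = data * 2.0 + 1.0
print(result)  # [3. 5. 7. 9.]
```

The SISD equivalent would be an explicit Python loop computing `2 * x + 1` one element at a time.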
Multiple Instruction, Single Data (MISD):
• Few actual examples of this class of parallel computer have ever existed
• Some conceivable examples might be:
– multiple frequency filters operating on a single signal stream
– multiple cryptography algorithms attempting to crack a single coded message.
Multiple Instruction, Multiple Data (MIMD):
• Currently, the most common type of parallel computer
• Multiple Instruction: every processor may be executing a different instruction stream
• Multiple Data: every processor may be working with a different data stream
• Execution can be synchronous or asynchronous, deterministic or non- deterministic
Advantages of Parallel Computing
 Provide Concurrency(do multiple things at the same time)
 Taking advantage of non-local resources
 Cost Savings
 Overcoming memory constraints
 Save time and money
Disadvantages of Parallel Computing
 The primary disadvantage is the lack of scalability between memory and CPUs.
 The programmer is responsible for the synchronization constructs that ensure "correct"
access to global memory.
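That synchronization burden can be shown with a small Python sketch: several threads increment a shared ("global") counter, and a lock makes each read-modify-write step atomic. Without the lock, concurrent increments can interleave and updates get lost.

```python
import threading

counter = 0                      # shared global memory
lock = threading.Lock()          # the synchronization construct

def increment(n):
    global counter
    for _ in range(n):
        with lock:               # protects the read-modify-write below
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 — deterministic only because of the lock
```

Getting such constructs right, everywhere shared memory is touched, is exactly the responsibility that falls on the programmer.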
GRID COMPUTING
• A grid is a shared collection of reliable resources (tightly coupled clusters) and
unreliable resources (loosely coupled machines), together with interactively
communicating researchers from different virtual organisations.
How Grid computing works ?
Grid Architecture
Working of layers
 Fabric: The lowest layer; its job is to present a common interface to all available kinds of
resources. Access by higher layers is granted via standardized processes.
 Resource and connectivity protocols: The connectivity layer defines the basic
communication- and authentication protocols which are needed by the grid. While the
communication protocols allow the exchange of files between different resources
connected by the first layer, the authentication protocols allow the two partners to
communicate confidentially and to verify each other's identity.
 Collective services: The purpose of this layer is the coordination of multiple resources.
Access to these resources does not happen directly but only via the underlying protocols
and interfaces.
 User applications: This layer contains all applications that operate in the environment
of a virtual organization. Applications call on jobs of the lower layers and can use
resources transparently.
Advantages of Grid Computing
Disadvantages of Grid Computing
• Resource sharing is further complicated when the grid is introduced as a solution for
utility computing, where commercial applications and resources become shareable,
on-demand resources.
• Some applications may need to be tweaked to take full advantage of the new model.
Cluster Computing:
 A computer cluster is a group of loosely coupled computers that work together so closely
that in many respects they can be viewed as a single computer.
 Clusters are commonly connected through fast local area networks.
 In cluster computing each node within a cluster is an independent system, with its own
operating system, private memory, and, in some cases, its own file system.
 Because the processors on one node cannot directly access the memory on the other
nodes, programs or software run on clusters usually employ a procedure called "message
passing" to get data and execution code from one node to another.
 Cluster computing can also be used as a relatively low-cost form of parallel processing for
scientific and other applications that lend themselves to parallel operations.
Types of Cluster :
1. High Availability or Failover Clusters
2. Load Balancing Cluster
3. Parallel/Distributed Processing Clusters
High Availability or Failover Clusters:
• These clusters are designed to provide uninterrupted availability of data or services
(typically web services) to the end-user community
• If a node fails, the service can be restored without affecting the availability of the services
provided by the cluster.
• The purpose of these clusters is to ensure that only a single instance of an application is
running on one cluster member at any time; if and when that cluster member is no longer
available, the application fails over to another cluster member.
Load Balancing Cluster
• This type of cluster distributes incoming requests for resources or content among multiple
nodes running the same programs or having the same content.
• Both the high availability and load-balancing cluster technologies can be combined to
increase the reliability, availability, and scalability of application and data resources that
are widely deployed for web, mail, news, or FTP services.
• Every node in the cluster is able to handle requests for the same content or application.
• This type of distribution is typically seen in a web-hosting environment.
Load Balancing Cluster Diagram:
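A minimal sketch of the load-balancing idea, in Python: incoming requests are handed to nodes in round-robin order, so every node (each able to serve the same content) takes an equal share. The class and node names are illustrative; production balancers also weigh current load and node health.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distributes requests across identical nodes in rotation."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)   # endless round-robin iterator

    def route(self, request):
        node = next(self._nodes)     # pick the next node in rotation
        return f"{node} handles {request}"

lb = RoundRobinBalancer(["node-1", "node-2", "node-3"])
for req in ["req-A", "req-B", "req-C", "req-D"]:
    print(lb.route(req))             # req-D wraps back to node-1
```

Because every node can serve any request, the balancer needs no knowledge of the content itself, only of which node's turn it is.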
Parallel/Distributed Processing Clusters
• Traditionally, parallel processing was performed by multiple processors in a specially
designed parallel computer: systems in which multiple processors share a single
memory and bus interface within a single computer.
• These types of cluster increase availability, performance, and scalability for
applications, particularly computationally or data intensive tasks.
Cluster Component:
• The basic building blocks of clusters are broken down into multiple categories:
a. Cluster Nodes
b. Cluster Network
c. Network Characterization
Cluster Application
 There are three primary categories of applications that use parallel clusters:
1. Compute-Intensive Applications.
2. Data- or I/O-Intensive Applications.
3. Transaction-Intensive Applications.
Peer to peer Computing:
