CS3551 - DISTRIBUTED COMPUTING
UNIT I INTRODUCTION 8
Introduction: Definition-Relation to Computer System Components – Motivation – Message-Passing Systems versus Shared Memory Systems – Primitives
for Distributed Communication – Synchronous versus Asynchronous Executions – Design Issues and Challenges; A Model of Distributed Computations: A
Distributed Program – A Model of Distributed Executions – Models of Communication Networks – Global State of a Distributed System.
UNIT II LOGICAL TIME AND GLOBAL STATE 10
Logical Time: Physical Clock Synchronization: NTP – A Framework for a System of Logical Clocks – Scalar Time – Vector Time; Message Ordering and Group
Communication: Message Ordering Paradigms -Asynchronous Execution with Synchronous Communication – Synchronous Program Order on Asynchronous
System – Group Communication – Causal Order – Total Order; Global State and Snapshot Recording Algorithms: Introduction – System Model and Definitions
– Snapshot Algorithms for FIFO Channels.
UNIT III DISTRIBUTED MUTEX AND DEADLOCK 10
Distributed Mutual Exclusion Algorithms: Introduction – Preliminaries – Lamport's Algorithm – Ricart–Agrawala's Algorithm – Token-Based Algorithms –
Suzuki-Kasami’s Broadcast Algorithm; Deadlock Detection in Distributed Systems: Introduction – System Model – Preliminaries – Models of Deadlocks –
Chandy-Misra-Haas Algorithm for the AND model and OR Model.
UNIT IV CONSENSUS AND RECOVERY 10
Consensus and Agreement Algorithms: Problem Definition – Overview of Results – Agreement in a Failure-Free System (Synchronous and Asynchronous) –
Agreement in Synchronous Systems with Failures; Checkpointing and Rollback Recovery: Introduction – Background and Definitions – Issues in Failure
Recovery – Checkpoint-based Recovery – Coordinated Checkpointing Algorithm – Algorithm for Asynchronous Checkpointing and Recovery
UNIT V CLOUD COMPUTING 7
Definition of Cloud Computing – Characteristics of Cloud – Cloud Deployment Models – Cloud Service Models – Driving Factors and Challenges of Cloud –
Virtualization – Load Balancing – Scalability and Elasticity – Replication – Monitoring – Cloud Services and Platforms: Compute Services – Storage Services –
Application Services
TEXT BOOKS
1. Kshemkalyani Ajay D, Mukesh Singhal, “Distributed Computing: Principles, Algorithms and
Systems”, Cambridge University Press, 2011.
2. Mukesh Singhal, Niranjan G Shivaratri, “Advanced Concepts in Operating Systems”, McGraw-Hill, 1994.
REFERENCES
1. George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems Concepts and Design”, Fifth
Edition, Pearson Education, 2012.
2. Pradeep K Sinha, “Distributed Operating Systems: Concepts and Design”, Prentice Hall of India, 2007.
3. Tanenbaum A S, Van Steen M, “Distributed Systems: Principles and Paradigms”, Pearson Education, 2007.
4. Liu M L, “Distributed Computing: Principles and Applications”, Pearson Education, 2004.
5. Nancy A Lynch, “Distributed Algorithms”, Morgan Kaufmann Publishers, 2003.
6. Arshdeep Bahga, Vijay Madisetti, “Cloud Computing: A Hands-On Approach”, Universities Press, 2014.
UNIT I – INTRODUCTION
Introduction
 Father of the field : Leslie B. Lamport
 Sister concern of OS
 OS and DC
 Need for Distributed computing
 - Efficiency
 - Fault Tolerance
 - Scalability
 Distributed computing and parallel computing
 Distributed computing and Distributed system
Definition and Characteristics
• Definition
Distributed computing is a collection of loosely coupled computers, each with its own memory and operating system, that communicate by message passing over a communication network and cooperate to address a problem collectively.
• Characteristics
1. No common physical clock - independent start time
2. No shared memory - Direct or indirect communication through message passing
3. Geographical separation
- Resides at any location communicating through networking
- NOW / COW ( Network / Cluster of Workstation)
4. Autonomy and heterogeneity - Nodes run at different speeds and on different operating systems
Motivation / Benefits
1) Inherently distributed computation
Some computations are inherently distributed because they involve communication between geographically distant parties, e.g., money transfer in banking.
2) Resource sharing
• Datasets are distributed across several servers, e.g., a distributed database such as DB2
• Replication of resources is also possible
3) Access to geographically remote data and resources
• Database - e.g., payroll data of an MNC accessed from remote branches
• Supercomputer - an expensive centralized resource accessed remotely
• Wireless technology
Motivation / Benefits
4) Enhanced reliability
• Availability - the resource or service should be accessible at all times
• Integrity - the value or state of the resource must remain correct, not corrupted or irrelevant
• Fault tolerance - the system must be able to recover from failures, e.g., by keeping backups
5) Increased performance / cost ratio
• The overall performance obtained relative to total cost is better than with conventional centralized computing
6) Scalability
• The system can continue to handle the load as the number of users grows, by adding more processors
Motivation / Benefits
7) Modularity and incremental expandability
• Modularity - each node can have a different configuration
• Incremental expandability – nodes can be added or replaced incrementally
Applications
1) Finance and commerce - Amazon
2) Information society - Search engines
3) Cloud technologies - AWS
4) Entertainment - Online gaming, YouTube
5) Healthcare - Online patient records
6) Education - E-learning
7) Transport and logistics - Google Maps
8) Environment management - Sensor technologies
Unique Drawbacks
1) Communication
• Communication does not come for free
• Limit on quantity of information sent
• Limit on the speed at which information is sent
• Cost of transmission is high
2) Incomplete Knowledge
• Each processor has only limited knowledge of the system and of the computations carried out by other processors, i.e., it does not know, for example, the size of the input held at another processor
Relation to computer system components
A distributed system connects processors by a communication network
• A distributed computing system consists of several computers (nodes), each with its own processor and memory, all connected via a WAN / LAN.
• The communication network enables information transfer between the nodes.
• Communication network helps in information transfer between nodes.
Relation to computer system components
Interaction of software components at each processor
• Network protocol stack - http, mail, ftp and telnet
• Middleware - holds the libraries required by distributed computing
- e.g., standards and libraries such as CORBA (Common Object Request Broker Architecture, from the OMG – Object Management Group), RPC (Remote Procedure Call), and MPI (Message Passing Interface)
Relation to computer system components
Interaction of software components at each processor
• There is communication between network protocol stack, OS and middleware.
• The middleware layer does not include the traditional application-layer functions of the network protocol stack (such as http, mail, ftp and telnet)
• Middleware libraries provide functions such as reliable and ordered multicasting
Message passing system versus shared memory system
Single processor system
Message passing - Common message queue
Shared memory - Common memory
Message passing system versus shared memory system
• In a conventional (single-computer) system, shared memory is the usual means of communication
• Communication takes place through shared data variables and control variables
• Semaphores and monitors are used to achieve synchronization when shared memory is used
• Distributed computing
Message passing is the native communication mechanism in distributed computing; a shared-memory abstraction built on top of message passing is called distributed shared memory
Message passing uses two primitives
send (dest_name , message)
receive ( source_name , message)
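A rough Python sketch of how these two primitives behave (not the textbook's API): per-process mailboxes stand in for the network, and the extra source_name argument to send is an assumption made only for this illustration.

import queue

# one mailbox per process stands in for the communication network (illustrative only)
mailboxes = {"p1": queue.Queue(), "p2": queue.Queue()}

def send(dest_name, message, source_name):
    # deposit the message in the destination process's mailbox
    mailboxes[dest_name].put((source_name, message))

def receive(source_name, own_name):
    # block until the next message arrives, then check it came from the expected source
    sender, message = mailboxes[own_name].get()
    assert sender == source_name
    return message

send("p2", "hello", source_name="p1")
print(receive("p1", own_name="p2"))   # prints: hello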
Emulating Message Passing on Shared Memory system
(MP → SM)
• The shared address space is divided into multiple parts
• Each part is assigned to a different processor.
• Message passing uses two primitives
send (dest_name , message)
receive ( source_name , message)
• Shared memory uses two primitives : Read and Write
• The message-passing primitives have to be emulated using the shared-memory primitives
Send is emulated as a Write (a separate location is reserved in the partition as a mailbox)
Receive is emulated as a Read
• The Read and Write operations have to be synchronized
(Figure: processors p1, p2, p3 accessing the partitioned shared memory.)
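A minimal sketch of this emulation, assuming a single reserved mailbox slot in the shared partition and a condition variable for synchronizing the Read and Write; the class and its names are illustrative, not part of any standard API.

import threading

class SharedMemoryMailbox:
    def __init__(self):
        self.slot = None                      # reserved location in the shared partition
        self.cond = threading.Condition()     # Read and Write must be synchronized

    def send(self, message):                  # Send emulated as a Write
        with self.cond:
            while self.slot is not None:      # wait until the mailbox is empty
                self.cond.wait()
            self.slot = message               # write into the reserved location
            self.cond.notify_all()

    def receive(self):                        # Receive emulated as a Read
        with self.cond:
            while self.slot is None:          # wait until a message has been written
                self.cond.wait()
            message, self.slot = self.slot, None
            self.cond.notify_all()
            return message

box = SharedMemoryMailbox()
threading.Thread(target=lambda: box.send("hi from p1")).start()
print(box.receive())                          # p2 reads what p1 wrote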
Emulating Shared Memory on Message Passing system
(SM → MP)
• Read is emulated as Receive and Write is emulated as Send
• Each shared location is modeled as a separate process
(a) Write to a shared location is emulated by sending an update message to
the corresponding owner process
(b) Read to a shared location is emulated by sending a query message to the
corresponding owner process
• With this shared-memory emulation, the latency involved in Read and Write operations may be high.
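A minimal sketch of the reverse emulation, assuming each shared location is owned by a separate process (modeled here as a thread) that serves update and query messages; all names are illustrative.

import threading, queue

class OwnerProcess(threading.Thread):
    """Owns one shared location; serves update (Write) and query (Read) messages."""
    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.value = None

    def run(self):
        while True:
            kind, payload = self.inbox.get()
            if kind == "update":              # emulated Write
                self.value = payload
            elif kind == "query":             # emulated Read; payload is a reply queue
                payload.put(self.value)

owner = OwnerProcess(); owner.start()

def write(value):
    owner.inbox.put(("update", value))        # Write -> update message to the owner

def read():
    reply = queue.Queue()
    owner.inbox.put(("query", reply))          # Read -> query message to the owner
    return reply.get()                         # latency of Read/Write can be high

write(42)
print(read())   # 42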
Emulating Shared Memory on Message Passing system
(SM → MP)
• An application can use a combination of shared memory and message passing.
• In a MIMD (Multiple Instruction, Multiple Data) message-passing multicomputer system,
- each node may itself be a multiprocessor system
- within a multiprocessor node, processors communicate via shared memory
- communication between two computers (nodes) is done using message passing
Difference between Shared Memory and Message Passing
• Shared memory is mainly used for sharing data between processes; message passing is mainly used for exchanging messages between processes.
• Shared memory offers the maximum speed of computation, because communication happens through the shared region and system calls are needed only to establish it; message passing takes longer, because every exchange goes through the kernel (system calls).
• With shared memory, the code for reading and writing the shared data must be written explicitly by the developer; with message passing no such code is required, since the message-passing facility itself provides communication and synchronization between the communicating processes.
Difference between Shared Memory and Message Passing
• Shared memory is used between processes on the same machine (single-processor or multiprocessor) that share the same address space; message passing is most commonly used in a distributed setting where the communicating processes run on different devices linked by a network.
• Shared memory is the faster communication strategy; message passing is relatively slower.
• With shared memory, care must be taken that processes do not write to the same address simultaneously; message passing is useful for exchanging small quantities of data without such conflicts.
Primitives for Distributed Communication
• Message sending and receiving primitives : Send() and Receive()
• Parameters of Send() primitive :
(a) Destination
(b) Buffer in user space, containing the data to be sent.
• Two ways of sending data when Send() is invoked:
(a) Buffered - copies data from user buffer to kernel buffer and then from
kernel buffer to network
(b) Unbuffered – data is directly copied from the user buffer onto the network
Primitives for Distributed Communication
• Parameters of Receive() primitive :
(a) Source from which the data is to be received
(b) User buffer into which the data is to be received
• Way of receiving data when Receive() is invoked:
(a) Buffered – the Receive is generally buffered, since the data may already have arrived before the primitive is invoked and needs a storage place in the kernel
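A minimal sketch of the buffered path, with queues standing in (as an assumption) for the kernel buffers and the network: data is copied from the user buffer into a kernel buffer, moves across the network, waits in the receiver's kernel buffer, and is finally copied into the user buffer by Receive().

import queue

sender_kernel_buf, network, receiver_kernel_buf = queue.Queue(), queue.Queue(), queue.Queue()

def buffered_send(user_buffer):
    sender_kernel_buf.put(bytes(user_buffer))     # copy 1: user buffer -> kernel buffer
    network.put(sender_kernel_buf.get())          # copy 2: kernel buffer -> network

def deliver():                                    # done by the communication subsystem
    receiver_kernel_buf.put(network.get())        # data may arrive before Receive is issued

def buffered_receive(user_buffer):
    data = receiver_kernel_buf.get()              # copy: kernel buffer -> user buffer
    user_buffer[:len(data)] = data

buf = bytearray(5)
buffered_send(b"hello"); deliver(); buffered_receive(buf)
print(buf)    # bytearray(b'hello')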
Primitives for Distributed Communication
Types of primitives
1. Synchronous primitives ( Send/Receive - A cares for B and B cares for A )
• Handshake between sender and receiver
• Send completes when Receive completes
• Receive completes when data copied into buffer
2. Asynchronous primitives ( Send - A cares only about itself )
A Send primitive is said to be asynchronous if control returns back to the invoking process
after the data item to be sent has been copied out of the user-specified buffer.
Primitives for Distributed Communication
3. Blocking primitives ( Send/Receive - waits until its own task completes )
A primitive is blocking if control returns to the invoking process after the processing
for the primitive (whether in synchronous or asynchronous mode) completes.
- Synchronous messaging
- Asynchronous messaging
4. Nonblocking primitives (Send/Receive - Not bothered about its own task )
• A primitive is non-blocking if control returns back to the invoking process
immediately after invocation, even though the operation has not completed.
• Send: even before data copied out of user buffer
• Receive: even before data may have arrived from sender
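A minimal sketch contrasting the two modes in Python; the copy_out delay stands in for copying the data out of the user buffer, and all names are assumptions made for this illustration only.

import threading, time

def copy_out(data):
    time.sleep(0.1)                 # pretend the copy / transmission takes time
    print("copied:", data)

def blocking_send(data):
    copy_out(data)                  # control returns only after the processing completes

def nonblocking_send(data):
    t = threading.Thread(target=copy_out, args=(data,))
    t.start()                       # control returns immediately; operation not yet complete
    return t                        # handle the caller can later check or wait on

blocking_send("m1")                 # prints before returning
handle = nonblocking_send("m2")     # returns at once
print("doing other work ...")
handle.join()                       # later: wait for completion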
Primitives for Distributed Communication
• A non-blocking primitive call returns a system-generated handle as a return parameter; the handle can later be used to check the status of completion of the call.
Completion can be checked in two ways
1. Polling - keep testing, in a loop or periodically, whether the handle has been posted
Send(X, destination, handlek) //handlek is a return parameter
2. Wait with handles - issue a Wait with the list of handles
Wait(handle1, handle2, ..., handlek, ..., handlem) //the Wait call blocks until at least one of the stipulated handles is posted
• The kernel/communication subsystem is responsible for detecting and posting the completion of the processing of the primitive
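A minimal sketch of the two completion-checking styles, using concurrent.futures as a stand-in for system-generated handles (an assumption; real kernels post handles differently).

import concurrent.futures, time

def transmit(msg):
    time.sleep(0.1)
    return f"sent {msg}"

pool = concurrent.futures.ThreadPoolExecutor()
handle1 = pool.submit(transmit, "X")      # non-blocking Send returns a handle
handle2 = pool.submit(transmit, "Y")

# 1. Polling: periodically test whether the handle has been posted
while not handle1.done():
    time.sleep(0.01)                      # the process could do other useful work here

# 2. Wait with handles: block until at least one stipulated handle is posted
done, pending = concurrent.futures.wait(
    [handle1, handle2], return_when=concurrent.futures.FIRST_COMPLETED)
print([h.result() for h in done])
pool.shutdown()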
Primitives for Distributed Communication
• Four versions of the Send primitive: blocking synchronous, non-blocking synchronous, blocking asynchronous, and non-blocking asynchronous
Primitives for Distributed Communication
• Each version is illustrated using a timing diagram
• The timing diagram has three time lines
1. Process execution
2. User buffer from / to which data is sent / received
3. Kernel / communication subsystem
Primitives for Distributed Communication
(a) Blocking synchronous Send and blocking synchronous Receive
(b) Non-blocking synchronous Send and non-blocking synchronous Receive
(In the timing diagrams, Pi is the sender and Pj is the receiver.)
Primitives for Distributed Communication
(c) Blocking asynchronous Send (d) Non-blocking asynchronous Send
Primitives for Distributed Communication
Uses :
• A synchronous Send is easier to use from a programmer's perspective because
the handshake between the Send and the Receive makes the communication
appear instantaneous, thereby simplifying the program logic.
• The nonblocking synchronous Send also avoids the potentially large delays for
handshaking, particularly when the receiver has not yet issued the Receive call.
• The nonblocking asynchronous Send is useful when a large data item is being
sent because it allows the process to perform other instructions in parallel with
the completion of the Send.
• The nonblocking Receive is useful when a large data item is being received
and/or when the sender has not yet issued the Send call, because it allows the
process to perform other instructions in parallel with the completion of the
Receive.
Primitives for Distributed Communication
Processor Synchrony
• Processor synchrony indicates that all the processors execute in lock-step with their clocks synchronized
• Exact lock-step synchrony is not attainable in a distributed system
• It is attainable only at a larger granularity of code, usually termed a step
• Synchronization at this granularity can be achieved using barrier synchronization
• Barrier synchronization:
Ensures that no processor begins executing the next step of code until all the
processors have completed executing the previous steps of code assigned to
each of the processors.
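A minimal sketch of barrier synchronization, with threads standing in for processors (illustrative only): no thread starts its next step until all have completed the previous one.

import threading

N = 3
barrier = threading.Barrier(N)

def processor(i):
    print(f"p{i}: step 1 done")
    barrier.wait()                 # block until all N processors reach this point
    print(f"p{i}: step 2 begins")

threads = [threading.Thread(target=processor, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()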
Primitives for Distributed Communication
Libraries and Standards
• Computer systems use a variety of software products and scientific tools for sending messages and making remote calls
• Large companies use their own custom methods, such as IBM's CICS (Customer Information Control System) software.
• Scientists use libraries like MPI and PVM
• Commercial software commonly uses RPC (Remote Procedure Call), a mechanism for calling functions on different computers; widely used variants are Sun RPC and DCE (Distributed Computing Environment) RPC.
• Socket primitives are used to make these remote calls work
Primitives for Distributed Communication
Libraries and Standards
• Two other mechanisms for communication are messaging (which focuses on point-to-point communication between specific applications) and streaming (which supports broader data dissemination and real-time processing)
• For object-based software, RMI (Remote Method Invocation) and ROI (Remote Object Invocation) are used, and large standardized systems such as CORBA and DCOM (Distributed Component Object Model) are also in use.
Synchronous VS Asynchronous Executions
Synchronous execution:
• Processors are synchronized and the clock drift rate between any two processors is bounded
• Message delivery (transmission + delivery) times are such that they occur in one logical step or round
• There is a known upper bound on the time taken by a process to execute a step
Asynchronous execution:
• There is no processor synchrony and there is no bound on the drift rate of processor clocks
• Message delays (transmission + propagation times) are finite but unbounded
• There is no upper bound on the time taken by a process to execute a step
Synchronous VS Asynchronous Executions
Code for synchronous execution
All the messages sent in a round are received within that same round.
(1) Sync_Execution(int k, n)     // there are k rounds
(2)   for r = 1 to k do
(3)     process i sends a message to (i + 1) mod n and (i − 1) mod n;
(4)     each process i receives a message from (i + 1) mod n and (i − 1) mod n;
(5)     compute some application-specific function on the received values;
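A runnable Python sketch of the same round structure, simulated sequentially; the max function used in each round is an arbitrary example of an application-specific function.

def sync_execution(k, n):
    values = list(range(n))                       # each process i starts with value i
    for r in range(1, k + 1):
        # phase 1: every process i sends its value to (i + 1) mod n and (i - 1) mod n
        inbox = [[] for _ in range(n)]
        for i in range(n):
            inbox[(i + 1) % n].append(values[i])
            inbox[(i - 1) % n].append(values[i])
        # phase 2: every process receives within the same round and computes on the values
        values = [max([values[i]] + inbox[i]) for i in range(n)]
    return values

print(sync_execution(k=3, n=5))    # after enough rounds every process holds the maximum, n-1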
Synchronous VS Asynchronous Executions
Emulation
A → S : an asynchronous program can be emulated on a synchronous system trivially, since all communication finishes within the same round in which it is initiated
S → A : a synchronous program can be emulated on an asynchronous system using a tool called a synchronizer
(Figure: emulations among the principal system classes in a failure-free system.)
Design Issues and Challenges
1. From system perspective
2. Algorithmic challenges
3. Applications of distributed computing
From system perspective:
1. Communication
2. Processes
3. Naming
4. Synchronization
5. Data storage and access
6. Consistency and replication
7. Fault tolerance
8. Security
9. Applications Programming Interface (API) and transparency
10. Scalability and modularity
Design Issues and Challenges
1. Communication :
involves designing appropriate mechanisms for communication among the
processes in the network.
Eg: Remote Procedure Call (RPC), Remote Object Invocation (ROI), message-oriented communication versus stream-oriented communication
2. Processes :
management of processes and threads at clients/servers; code migration; and the design
of software and mobile agents
3. Naming :
Devising easy to use and robust schemes for names, identifiers, and addresses is essential for
locating resources and processes in a transparent and scalable manner. Naming in mobile
systems provides additional challenges because naming cannot easily be tied to any static
geographical topology
Design Issues and Challenges
4. Synchronization :
Mechanisms for synchronization or coordination among the processes are essential.
Eg: Mutual Exclusion and Leader election algorithms are required
Physical and logical clocks have to be synchronized
Global state recording algorithms require different forms of synchronization
5. Data storage and access :
Proper schemes are required for storing and accessing the data in a fast and
scalable manner across the network in order to provide efficiency.
6. Consistency and replication :
Replication of data objects is highly desirable for fast access of data and
scalability. Dealing with the consistency among the replicas in distributed
setting is another challenge.
Design Issues and Challenges
7. Fault Tolerance :
(a) Fault tolerance requires maintaining correct and efficient operation in spite of
any failures of links, nodes, and processes.
(b) Process resilience, reliable communication, distributed commit, checkpointing
and recovery, agreement and consensus, failure detection, and self-stabilization are
some of the mechanisms to provide fault-tolerance.
8. Security :
involves various aspects of cryptography, secure channels, access control, key
management– generation and distribution, authorization, and secure group
management.
9. Applications Programming Interface and Transparency :
The API for communication and other specialized services is important for the ease of
use and wider adoption of the distributed systems services by non-technical users.
Design Issues and Challenges
Transparency :
Transparency deals with hiding the implementation policies from the user
(a) Access Transparency : hides differences in data representation on different
systems and providing uniform operations to access
system resources.
(b) Location Transparency : makes the locations of resources transparent to the
users.
(c) Migration Transparency : allows relocating resources without the users
noticing it.
Design Issues and Challenges
(d) Relocation Transparency : The ability to relocate the resources as they are
being accessed
(e) Replication Transparency : does not let the user become aware of any
replication
(f) Concurrency Transparency : deals with masking the concurrent use of shared
resources for the user.
(g) Failure Transparency : refers to the system being reliable and fault-tolerant.
Design Issues and Challenges
10. Scalability and Modularity :
The algorithms, data (objects), and services must be as distributed as possible. Various techniques such as replication, caching and cache management, and asynchronous processing help to achieve scalability.
Algorithmic challenges in Distributed Computing
Distributed Program
• It is composed of a set of n asynchronous processes p1, p2, ..., pi, ..., pn that
communicate by message passing over the communication network.
• Each process runs on a different processor
• Cij denote the channel from process pi to process pj
• mij denote a message sent by pi to pj.
• The processes do not share a global memory and communicate solely by passing
messages.
• processes do not share a global clock
Distributed Program
• Process execution and message transfer are asynchronous (a process does not wait for the delivery of a message it has sent to complete)
• The global state of a distributed computation is composed of the states of the
processes and the communication channels
- The state of a process is characterized by the state of its local memory
- The state of a channel is characterized by the set of messages in transit
in the channel
Model of Distributed Execution
• In the space–time diagram, a dot indicates an event, a horizontal line shows the progress of a process, and a slant arrow indicates a message transfer
• The actions of a process are modeled as three types of events:
internal events, message send events, and message receive events.
• Let ei^x denote the x-th event at process pi.
• For a message m, send(m) and rec(m) denote its send and receive events, respectively, and send(m) →msg rec(m).
Model of Distributed Execution
• The occurrence of events changes the states of respective processes and
channels.
- An internal event changes the state of the process at which it occurs.
- A send event (or a receive event) changes the state of the process that
sends (or receives) the message and the state of the channel on which
the message is sent (or received).
• The events at a process are linearly ordered by their order of occurrence.
Model of Distributed Execution
• The events at a process are linearly ordered by their order of occurrence.
• The execution of process pi produces a sequence of events ei^1, ei^2, ..., ei^x, ei^(x+1), ... and is denoted by Hi, where
Hi = (hi, →i)
hi is the set of events produced by pi, and
the binary relation →i defines a linear order on these events.
• Relation →i expresses causal dependencies among the events of pi.
• The execution of an event takes a finite amount of time.
Model of Distributed Execution
Causal Precedence Relation
• For any two events ei and ej , ei → ej ⇒ ej ↛ ei (the relation → is antisymmetric).
• For any two events ei and ej , ei ↛ ej does not imply ej → ei (the two events may be causally unrelated, i.e., concurrent).
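A minimal sketch of checking causal precedence as reachability in the space–time diagram; the events, edges, and helper names below are a made-up example, not taken from the slides.

from collections import defaultdict

edges = defaultdict(list)                 # process-line edges and message edges

def process_line(*events):                # consecutive events on one process line
    for a, b in zip(events, events[1:]):
        edges[a].append(b)

def message(send_event, recv_event):      # send(m) ->msg rec(m)
    edges[send_event].append(recv_event)

process_line("e1_1", "e1_2", "e1_3")      # events of p1
process_line("e2_1", "e2_2", "e2_3")      # events of p2
message("e1_1", "e2_2")                   # p1 sends a message m to p2

def happens_before(a, b):
    """ei -> ej iff a path of process-line and message edges leads from a to b."""
    stack, seen = list(edges[a]), set()
    while stack:
        cur = stack.pop()
        if cur == b:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(edges[cur])
    return False

print(happens_before("e1_1", "e2_3"))     # True:  e1_1 -> e2_2 -> e2_3
print(happens_before("e2_3", "e1_1"))     # False: -> is antisymmetric
print(happens_before("e1_2", "e2_1") or happens_before("e2_1", "e1_2"))  # False: concurrent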