Chapter One.ppt

1.1 Introduction and Definition
 Definition of a Distributed System
 Characteristics of Distributed Systems
 Organization and Goals of a Distributed System
 Hardware and Software Concepts
 The Client-Server model
2

 a distributed system is:
a collection of independent computers that appears to its
users as a single coherent system
(Tanenbaum & Van Steen)
 this definition has two aspects:
1. hardware: autonomous machines
2. software: a single system view for the users
 A distributed system involves a collection of autonomous
computers: independent systems that each possess
their own memory and CPU
 A distributed system consists of multiple software
components that reside on multiple computers but run as a
single system

 A distributed system contains multiple nodes that are
physically separated but linked together by a network
 The computers in a distributed system can be
physically close together and connected by a LAN
 Or they can be geographically distant and connected by a
WAN
 Distributed computing has become increasingly common
due to advances that have made machines and networks
cheaper and faster
 Examples of distributed systems
 Distributed databases
 World Wide Web
 Email

 Distributed system example
 Think about a large banking system with hundreds
of branch offices all over the country. Each office
has a master computer that stores local accounts
and handles local transactions. In addition, each
computer can talk to all other branch
computers and to a central computer at
headquarters. If transactions can be done without
regard to where the customer and the account are,
the system appears to its users as a single system

Characteristics of Distributed Systems
 differences between the computers and the ways they
communicate are hidden from users
 users and applications can interact with a distributed
system in a consistent and uniform way regardless of
location
 distributed systems should be easy to expand and scale
 a distributed system is normally continuously available,
even if there may be partial failures
6

1.2 Organization and Goals of a Distributed System
 to support heterogeneous computers and networks and
to provide a single-system view, a distributed system is
often organized by means of a layer of software called
middleware that extends over multiple machines
7

 Goals of a distributed system: a distributed system
should
 easily connect users with resources (printers, computers,
storage facilities, data, files, Web pages, ...)
 reasons: economics, collaboration, and exchange of
information
 be transparent: hide the fact that the resources and
processes are distributed across multiple computers
 be open
 be scalable
 Transparency in a Distributed System
 a distributed system that is able to present itself to users
and applications as if it were only a single computer
system is said to be transparent
8

 users and applications see the DS as a single coherent
system
 different forms of transparency in a distributed system
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is physically located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource

 Openness in a Distributed System
 a distributed system should be open
 so that different open systems are able to interact and use
services from each other
 interoperability
 components of different origin can communicate
 portability
 components work on different platforms
 this requires well-defined interfaces
 services are specified through interfaces, often described in an
Interface Definition Language (IDL)
 an IDL definition specifies only syntax: the names of the functions, the types of
parameters, and the return values
 a distributed system should be independent of the heterogeneity of the
underlying environment
 hardware, software platforms, and languages
 an open distributed system is a system that offers services according to
standard rules that describe the syntax and semantics of those services;
e.g., protocols in networks
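
As a rough illustration of an interface that fixes only syntax, the sketch below uses Python's typing.Protocol as a stand-in for a real IDL. The FileService interface, the InMemoryFileService class, and copy_prefix are hypothetical names invented for this example.

```python
from typing import Protocol


class FileService(Protocol):
    """Hypothetical interface: function names, parameter types, return types only."""

    def read(self, path: str, offset: int, length: int) -> bytes: ...

    def write(self, path: str, offset: int, data: bytes) -> int: ...


class InMemoryFileService:
    """One possible implementation; any class with the same methods would fit."""

    def __init__(self):
        self._files = {}  # path -> bytearray

    def read(self, path: str, offset: int, length: int) -> bytes:
        return bytes(self._files.get(path, bytearray())[offset:offset + length])

    def write(self, path: str, offset: int, data: bytes) -> int:
        buf = self._files.setdefault(path, bytearray())
        buf[offset:offset + len(data)] = data
        return len(data)


def copy_prefix(service: FileService, src: str, dst: str, n: int) -> int:
    """Client code written against the interface, not a specific implementation."""
    return service.write(dst, 0, service.read(src, 0, n))
```

Because copy_prefix depends only on the interface, an implementation from a different origin could be swapped in without changing the client, which is the point of interoperability and portability.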

 Scalability in Distributed Systems
 a distributed system should be scalable: there are three
dimensions
 size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: should be easy to manage even if it
spans many administrative organizations
 A distributed system is scalable if it remains effective
when the number of resources and users is significantly
increased
 but even a scalable system may exhibit performance problems as it grows
11

 scalability problems: performance problems caused by
limited capacity of servers and networks
 Solution: simply improving their capacity (e.g., by
increasing memory, upgrading CPUs, or replacing network
modules) is often enough
 Scaling Techniques
 how to solve scaling problems for geographical scalability
 three possible solutions: hiding communication latencies,
distribution, and replication
12

a. Hide Communication Latencies
 try to avoid waiting for responses to remote service
requests
 let the requester do other useful work in the meantime
 i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives, the application is
interrupted to handle it
 good for batch processing and parallel applications but
not for interactive applications
 for interactive applications, move part of the job to the
client to reduce communication; e.g., filling in a form and
checking the entries
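
A minimal sketch of hiding latency with asynchronous requests, using Python's asyncio. The remote_call function is a hypothetical stand-in for a real remote service; its sleep only models network delay.

```python
import asyncio
import time


async def remote_call(name: str, latency: float) -> str:
    """Stand-in for a remote service request; the sleep models network latency."""
    await asyncio.sleep(latency)
    return f"reply from {name}"


async def main():
    start = time.perf_counter()
    # Issue both requests without blocking on either one.
    pending = [
        asyncio.create_task(remote_call("server-a", 0.3)),
        asyncio.create_task(remote_call("server-b", 0.3)),
    ]
    local_result = sum(range(1000))  # useful local work while requests are in flight
    replies = await asyncio.gather(*pending)
    return replies, local_result, time.perf_counter() - start


replies, local_result, elapsed = asyncio.run(main())
print(replies, f"{elapsed:.2f}s")
```

The two waits overlap, so the total time is close to one request's latency rather than the sum of both.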
13

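 e.g., shipping code to the client is supported in Web applications
(historically through Java applets, today mostly through JavaScript)
Figure: checking a form as it is being filled in
(a) a server checking the correctness of field entries
(b) a client doing the job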
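
Case (b) can be sketched as follows. The field names, the rules in validate_form, and the send callback are assumptions made for this example; the idea is only that checks run on the client, so a malformed form never costs a round trip to the server.

```python
import re


def validate_form(fields: dict) -> list:
    """Client-side checks; returns a list of error messages (empty if all pass)."""
    errors = []
    if not fields.get("name", "").strip():
        errors.append("name is required")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", fields.get("email", "")):
        errors.append("email looks malformed")
    if not fields.get("age", "").isdigit():
        errors.append("age must be a whole number")
    return errors


def submit(fields: dict, send) -> bool:
    """Contact the server (via the send callback) only once local checks pass."""
    if validate_form(fields):
        return False
    send(fields)
    return True
```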
14

b. Distribution
 Take a component, split it into smaller parts, and
then spread those parts across the system (e.g., the
Domain Name System)
 There are multiple name servers that map symbolic
names (hostnames) to IP addresses
 In a URL, the part between the // and the following / is the
hostname of the server to which the client is going to send
the request
 for details, see later in Chapter 4 - Naming
Figure: an example of dividing the DNS name space into zones
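
The idea of zones can be sketched with a toy resolver. This is not how a real DNS resolver is implemented; the NAME_SERVERS table, the server names, and the address (taken from the documentation range 192.0.2.0/24) are all made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical name servers, one per zone. Each holds either delegations
# to the server of a child zone or final address records.
NAME_SERVERS = {
    "root": {"delegate": {"com": "ns-com", "org": "ns-org"}},
    "ns-com": {"delegate": {"example.com": "ns-example"}},
    "ns-org": {"delegate": {}},
    "ns-example": {"address": {"www.example.com": "192.0.2.10"}},
}


def resolve(hostname: str):
    """Follow delegations from the root; return (address, servers asked)."""
    labels = hostname.split(".")
    server, asked = "root", []
    for i in range(len(labels) - 1, -1, -1):
        asked.append(server)
        zone = NAME_SERVERS[server]
        if hostname in zone.get("address", {}):
            return zone["address"][hostname], asked
        server = zone["delegate"][".".join(labels[i:])]
    raise KeyError(hostname)


# The hostname is the part of the URL between "//" and the next "/".
host = urlsplit("http://www.example.com/index.html").hostname
address, asked = resolve(host)
```

No single server holds the whole name space: each one answers only for its own zone and points to the next, which is how the load is spread.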
15

c. Replication
 replicate components across a distributed system to
increase availability and to balance load, leading
to better performance
 replication makes multiple copies of the same services or
data available on different machines
 placing a replica close to where it is
accessed also reduces communication latency
 caching is a special form of replication
 but caching and replication may lead to consistency
problems (see Chapter 6 - Consistency and
Replication)
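
The consistency problem can be shown with a deliberately naive cache. The Primary and CachingReplica classes are hypothetical, and the sketch assumes cached copies are never invalidated.

```python
class Primary:
    """Holds the authoritative copy of each item and counts remote reads."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class CachingReplica:
    """Keeps copies close to the client; in this sketch they are never invalidated."""

    def __init__(self, primary):
        self.primary = primary
        self.cache = {}

    def read(self, key):
        if key not in self.cache:  # miss: one trip to the primary
            self.cache[key] = self.primary.read(key)
        return self.cache[key]  # hit: served locally


primary = Primary()
primary.write("balance", 100)
replica = CachingReplica(primary)
first = replica.read("balance")   # fetched from the primary
primary.write("balance", 250)     # update bypasses the replica
second = replica.read("balance")  # served from the cache, now stale
```

The second read is fast because it never leaves the replica, but it returns the old value: lower latency has been traded for consistency.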
16

1.3 Hardware and Software Concepts
 Hardware Concepts
 different classification schemes exist
 multiprocessors - with shared memory
 multicomputers - without shared memory
 can be homogeneous or heterogeneous

 a single backbone
Figure: different basic organizations of processors and memories in distributed
systems

 Multiprocessors - Shared Memory
Figure: a bus-based multiprocessor
 the shared memory has to be coherent - a value
written by one processor must be the value read afterwards by any other
processor
 performance problem for bus-based organization: the
bus becomes overloaded as the number of processors
increases
 the solution is to add a high-speed cache memory between
each processor and the bus to hold the most recently
accessed words; this may result in incoherent memory
 bus-based multiprocessors are difficult to scale even with
caches
 two possible solutions: crossbar switch and omega
network

 Crossbar switch
 divide memory into modules and connect them to the
processors with a crossbar switch
 at every intersection, a crosspoint switch can be opened or
closed to establish a connection
 problem: expensive; with n CPUs and n memories, n^2
switches are required

 Omega network
 use switches with multiple input and output lines
 drawback: high latency because of several switching
stages between the CPU and memory
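
The cost and latency trade-off between the two designs can be worked out numerically. The sketch assumes an omega network built from 2x2 switches with n a power of two, which gives log2(n) stages of n/2 switches each; the slides only say the switches have multiple input and output lines.

```python
def crossbar_switches(n: int) -> int:
    """Crosspoint switches for n CPUs and n memories: one per intersection."""
    return n * n


def omega_stages(n: int) -> int:
    """Switching stages between CPU and memory (assumes 2x2 switches)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1  # exact log2 for powers of two


def omega_switches(n: int) -> int:
    """Total 2x2 switches: n/2 per stage, log2(n) stages."""
    return (n // 2) * omega_stages(n)


for n in (8, 64, 1024):
    print(n, crossbar_switches(n), omega_switches(n), omega_stages(n))
```

For 1024 CPUs the crossbar needs 1,048,576 crosspoints while the omega network needs 5,120 switches, but every memory access then crosses 10 stages.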
21

 Homogeneous Multicomputer Systems
 also referred to as System Area Networks (SANs)
 could be bus-based or switch-based
 bus-based
 a shared multi-access network such as Fast Ethernet can
be used, and messages are broadcast
 performance drops sharply with more than 25-100 nodes

Figure: grid and hypercube topologies
 switch-based
 messages are routed through an interconnection network
 two popular topologies: meshes (or grids) and
hypercubes
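
Routing distance in the two topologies can be compared with a short sketch. It assumes hypercube nodes are numbered so that neighbors differ in exactly one address bit, and that the grid is two-dimensional without wraparound links.

```python
def hypercube_neighbors(node: int, dim: int) -> list:
    """Neighbors differ from node in exactly one of the dim address bits."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hypercube_hops(a: int, b: int) -> int:
    """Minimum hops: the number of differing address bits (Hamming distance)."""
    return bin(a ^ b).count("1")


def mesh_hops(a: tuple, b: tuple) -> int:
    """Minimum hops on a 2-D grid without wraparound (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# 16 nodes arranged as a 4-dimensional hypercube versus a 4x4 grid.
worst_cube = hypercube_hops(0b0000, 0b1111)
worst_grid = mesh_hops((0, 0), (3, 3))
```

With 16 nodes the farthest pair is 4 hops apart in the hypercube and 6 hops apart in the grid, at the price of more links per node in the hypercube.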
23

 Heterogeneous Multicomputer Systems
 most distributed systems are built on heterogeneous
multicomputer systems
 the computers could be different in processor type,
memory size, architecture, power, operating system, etc.
and the interconnection network may be highly
heterogeneous as well
 the distributed system provides a software layer to hide
the heterogeneity at the hardware level; i.e., provides
transparency
24

Software Concepts
 OSs in relation to distributed systems
 distributed OSs (DOS)
 network OSs (NOS)
 Middleware

 Distributed Operating Systems
 OS essentially tries to maintain a single, global view of the
resources it manages (Tightly-coupled OS)
 used for multiprocessors and homogeneous
multicomputers
 Full transparency: users feel they are interacting with one big
system and are not aware of the existence of multiple
machines
Figure: general structure of a multicomputer operating system

 Network Operating Systems (loosely coupled OS)
 a collection of computers, each running its own OS; they work together to
make their services and resources available to others via the network
 possibly heterogeneous underlying hardware
 No transparency: users are aware of the multiplicity of the machines
 users explicitly log in to remote machines or copy files from other
machines
 Access to remote services similar to local resources
Figure: general structure of a network operating system

 Middleware
 Most modern distributed systems are designed to provide a
level of transparency through a software layer on top of the local
OSs
 This software layer is called middleware
Figure: general structure of a distributed system as middleware

 Middleware
 Middleware hides the differences between various
computers and the ways in which they communicate
 It provides a single-system view
 As a result, middleware facilitates the
integration and interaction of various
networked applications in a consistent and
uniform manner

 different middleware models exist
 Remote Procedure Calls (RPCs) - calling a procedure
on a remote machine
 distributed object invocation
 message-oriented middleware
 (details later in Chapter 2 - Communication)
 middleware services
 access transparency: hiding the low-level message
passing (when calling a procedure or invoking an object
remotely)
 naming: such as a URL in the WWW
 distributed transactions: allowing multiple read and
write operations to occur atomically (TPM)
 security: middleware authenticates access to data and
services

1.4 The Client-Server Model
 how are processes organized in a distributed system?
 think in terms of clients requesting services from
servers
 a server is a process implementing a specific service (e.g., a file
or database server)
 a client is a process that requests a service from a server
and then waits for the server's reply
Figure: general interaction between a client and a server
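
The request-reply interaction can be sketched with two threads standing in for the client and server processes. The queues replace a real network connection, and the "service" (upper-casing a string) is a made-up placeholder.

```python
import queue
import threading

requests = queue.Queue()


def server():
    """Server: wait for a request, perform the service, send the reply."""
    while True:
        payload, reply_box = requests.get()
        if payload is None:  # shutdown signal used only by this sketch
            break
        reply_box.put(payload.upper())


def client(text: str) -> str:
    """Client: send a request, then block until the server's reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((text, reply_box))
    return reply_box.get(timeout=2)


worker = threading.Thread(target=server, daemon=True)
worker.start()
reply = client("hello")
requests.put((None, None))
worker.join(timeout=2)
```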
31

 Client-Server Architectures
 how to physically distribute a client-server application
across several machines:
 Two-tiered architecture
 physically distribute a client-server application
across two machines:
1. A client machine containing only the programs
implementing (part of) the user-interface level
2. A server machine containing the rest, that is, the
programs implementing the processing and data
levels
 Everything is handled by the server, while the client is
essentially no more than a dumb terminal, possibly with a
pretty graphical interface

Figure: two-tiered architecture, alternative client-server organizations
a) place only the terminal-dependent part of the user interface on
the client machine
b) place the entire user-interface software on the client side
c) move part of the application to the client, e.g., checking
correctness in filling forms
d) and e) are for powerful client machines

 Three-tiered architectures
 Many client-server applications are organized into three
layers
 the user-interface level: consists of the programs that allow end
users to interact with the application; usually through GUIs, but not
necessarily
 the processing level: contains the core functionality of the
application
 the data level: contains the actual data that a client wants to
manipulate through the application
34
three tiered architecture: an example of a server acting as a client

 the general organization of an Internet search engine into three
different layers
35
browser acts as an entry point to a site, passing requests to an
application server where the actual processing takes place, this
application server, in tum, interacts with a database server
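
The three levels can be sketched as three functions, each calling only the level below it. The PAGES data and the ranking rule (shortest title first) are invented for the example; a real search engine would differ at every level.

```python
# Data level: a hypothetical in-memory "database" of page titles.
PAGES = {
    1: "Distributed systems overview",
    2: "Client-server model",
    3: "Distributed databases",
}


def query_database(keyword: str) -> list:
    """Data level: return the raw rows that match the keyword."""
    return [(pid, title) for pid, title in PAGES.items()
            if keyword.lower() in title.lower()]


def search(keyword: str) -> list:
    """Processing level: acts as a client of the data level, then ranks the rows."""
    rows = query_database(keyword)
    return sorted(rows, key=lambda row: len(row[1]))  # toy ranking rule


def render(keyword: str) -> str:
    """User-interface level: turn the results into text for the end user."""
    hits = search(keyword)
    lines = [f"{i}. {title}" for i, (_, title) in enumerate(hits, 1)]
    return "\n".join(lines) or "no results"
```

The processing level is a server to the user interface and a client to the data level, which is the "server acting as a client" pattern.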

 Modern Architectures
 vertical distribution: placing logically different components
on different machines
 dividing an application into user-interface, processing,
and data components and distributing them across multiple
machines
 horizontal distribution: physically splitting up the client or the
server into logically equivalent parts, e.g., a Web server
Figure: an example of horizontal distribution of a Web service
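
Horizontal distribution of a Web server can be sketched as a front end that spreads requests over logically equivalent replicas. Round-robin is just one possible policy, and the WebServer and FrontEnd classes are hypothetical.

```python
import itertools


class WebServer:
    """One replica; every replica serves the same pages."""

    def __init__(self, name: str, pages: dict):
        self.name = name
        self.pages = pages
        self.handled = 0

    def get(self, path: str) -> str:
        self.handled += 1
        return self.pages[path]


class FrontEnd:
    """Hands each incoming request to the next replica in turn (round-robin)."""

    def __init__(self, replicas: list):
        self._next = itertools.cycle(replicas)

    def get(self, path: str) -> str:
        return next(self._next).get(path)


pages = {"/": "<h1>home</h1>"}
replicas = [WebServer(f"web-{i}", pages) for i in range(3)]
front = FrontEnd(replicas)
bodies = [front.get("/") for _ in range(6)]
```

Clients see one Web service, while each replica handles only a third of the load.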
36

Distributed Computing Systems: Cluster
 Many distributed systems are configured for
High-Performance Computing
 Cluster computing: a group of high-end systems
connected through a LAN
• Homogeneous: same OS, near-identical hardware
• Single managing node

Distributed Computing Systems: Grid
 Grid computing: lots of nodes from everywhere
• Heterogeneous
• Dispersed across several organizations
• Can easily span a wide-area network
 To allow for collaborations, grids generally use virtual
organizations.
• A grouping of users that allows for authorization on
resource allocation

Distributed Computing Systems: Cloud
 Cloud computing: makes a distinction between four layers.
 Hardware: processors, routers, power and cooling
systems. Customers normally never get to see these.
 Infrastructure: deploys virtualization techniques. Revolves
around allocating and managing virtual storage devices
and virtual servers. (IaaS)
 Platform: provides higher-level abstractions for storage
and such. (PaaS)
 Application: actual applications, such as office suites, e.g.,
text processors, spreadsheet applications. (SaaS)
