    Wk6a Presentation Transcript

    • Distributed Systems - An introduction
      • Introduction,
      • Features of Distributed Systems,
      • Naming,
      • Operating Systems,
      • Distributed Shared Memory,
      • Other Issues.
    • Introduction
      • Computer systems developed from:
        • Standalone machines,
        • to direct communications between two machines,
        • to networks: one machine can communicate with any other networked machine.
      • BUT:
        • In all cases the user is always aware of the connection between machines,
        • Must issue explicit commands for the movement of data.
    • Distribution(1)
      • Distributed Systems (DS) build on the networking layer:
        • Groups of independent machines acting together as one
          • Cooperate on a task, not just data sharing,
          • Distribute computation among several physical machines.
      • Distribution should be transparent to user AND programs at system call interface:
        • User or programmer should not be able to tell that a remote machine is involved,
        • IDEALLY DS should look like a conventional system to users:
          • Appear to users as a single computer .
    • Distribution (2)
      • Simplest possible distributed system architecture:
        • Some processes provide services to others, known as servers,
        • Processes which use those services are called clients.
      • This approach known as client/server system.
      • What the server does, and any data it returns to the client, varies depending on system requirements,
      • ALL different models of distributed computing can be reduced to this simple model.
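As a rough illustration of the client/server model described in this slide, the sketch below runs a tiny server and client over a TCP socket on the local machine. The function names (`serve_once`, `call_server`, `demo`) and the upper-casing "service" are assumptions made for this example only; they are not taken from the slides.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server: accept one client, read its request, send back a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024)
        # The "service" here is trivial: upper-case whatever the client sent.
        conn.sendall(request.upper())


def call_server(host: str, port: int, message: bytes) -> bytes:
    """Client: connect, send a request, and wait for the server's reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(message)
        return sock.recv(1024)


def demo() -> bytes:
    # Port 0 asks the OS for any free port on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()

    # Run the server in a background thread so the client can call it.
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    reply = call_server(host, port, b"hello")
    server.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Note that the client must know the server's host and port explicitly, which is the kind of visible connection that a fully transparent distributed system would aim to hide.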
    • Features of Distributed Systems(1)
      • DS have features not found in standalone systems:
        • Economy and Performance
          • Price/performance ratio favours multiple small machines:
            • Especially commodity components,
            • See Top500 list in later slide,
            • Cumulative performance of microcomputers at a fraction of the cost of a mainframe.
          • Utilise spare CPU cycles by dynamically using idle workstations. How many PCs are in the University's labs? In effect, free processing power!
          • Adapt to increased load and not collapse,
          • BUT more processors = more communications?: ISSUES HERE?!
    • Features of Distributed Systems (2)
        • Reliability:
          • High reliability and fault tolerance (through redundancy, user need never know a problem occurred!).
        • Resource Sharing:
          • Economic: share an expensive device (e.g. a radio telescope),
          • Convenience: not convenient to share a company’s database via floppy disk: distribution and data update problems!!!
        • Incremental Growth:
          • Not necessary to buy all processing power, memory, storage all at one time,
          • System expands to keep pace with growing demand.
    • Example: Top 500 List (1)
      • Example of price/performance mentioned earlier.
      • Clusters are a type of distributed system.
      • Top500 List: lists sites operating the 500 most powerful computer systems ( http://www.top500.org/ ):
        • Compiled twice yearly.
        • Entry to top 10 positions requires > 9.8 TFlop/s (trillions of floating-point operations per second).
        • #1: BlueGene/L DD2 beta, IBM/DOE, USA, 70.72 TFlop/s:
          • 32,768 0.7 GHz PowerPC 440 CPUs,
        • #2: SGI Altix, Voltaire Infiniband, NASA, USA, 51.87 TFlop/s:
          • 10,160 1.5 GHz SGI Altix CPUs,
        • #3: Earth Simulator, Japan (Climate Modelling): 35.86 TFlop/s
          • 5,120 500 MHz NEC Vector CPUs (640 8-CPU nodes), 10 TB memory, 16 GB/s inter-node bandwidth,
          • Was #1 from 6/2002 until 11/2004 when BlueGene/L DD2 took over.
    • Example: Top 500 List (2)
      • Highest ranking Intel cluster (at the moment):
        • #10: NCSA, Urbana-Champaign, USA,
        • 9.81 TFlop/s,
        • PowerEdge 1750,
        • 2500 P4 Xeon 3.06 GHz,
        • Myrinet interconnect.
      • Commodity hardware:
        • The same that you might buy for your own machines (Intel Xeon),
        • Myrinet probably not sitting on the shelf at your local PC World!
      • Economies of scale (COTS production):
        • Price/performance gains.
      • Operating System issues: How to coordinate various distributed components into a coherent system?!
    • A Top 500 contender?
    • Naming
      • Need to uniquely identify all resources
        • Individual machines, processes, files, printers, etc,
        • At system level identified by binary numbers,
        • Client needs to know identity of server machine AND identifier of process providing a service:
          • If server crashes and restarts: process may have a different process ID. Client unable to reach it.
          • If server machine crashes, another machine may take over servicing client requests. BUT client continues sending requests to old server’s binary address…
    • Name server
      • At human level provide meaningful resource names:
        • e.g. LaserPrinter1 instead of the binary network address representation.
      • At machine level still identified by binary numbers,
      • Binding: The link between name and number,
      • Name server process:
        • Maintains a database of bindings,
        • Translates names to binary (or IP) addresses for the client,
        • If binary identifiers change, only the name server needs updating,
        • If a client can locate the name server, it can locate any other resources in the system.
      • A single name can be used to reference a number of servers: Server fail-over, load-balancing, etc…
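The name server idea above can be sketched as a small in-memory table of bindings. This is a toy illustration only: the `NameServer` class, its method names, and the addresses used are hypothetical, and a real name service would also be reachable over the network.

```python
class NameServer:
    """Toy name server: keeps bindings from a name to one or more addresses."""

    def __init__(self) -> None:
        self._bindings: dict[str, list[str]] = {}

    def bind(self, name: str, address: str) -> None:
        """Record that `name` can be served at `address`."""
        addresses = self._bindings.setdefault(name, [])
        if address not in addresses:
            addresses.append(address)

    def unbind(self, name: str, address: str) -> None:
        """Remove a binding, e.g. after a server has crashed."""
        self._bindings[name].remove(address)

    def resolve(self, name: str) -> str:
        """Translate a name to an address, rotating through the servers
        bound to that name (simple round-robin load balancing)."""
        addresses = self._bindings.get(name)
        if not addresses:
            raise KeyError(f"no binding for {name!r}")
        address = addresses.pop(0)
        addresses.append(address)
        return address


if __name__ == "__main__":
    ns = NameServer()
    # Hypothetical addresses, chosen only for the demonstration.
    ns.bind("LaserPrinter1", "10.0.0.5")
    ns.bind("LaserPrinter1", "10.0.0.6")
    print(ns.resolve("LaserPrinter1"))
    print(ns.resolve("LaserPrinter1"))
    ns.unbind("LaserPrinter1", "10.0.0.5")  # simulate a failed server
    print(ns.resolve("LaserPrinter1"))
```

Because clients only ever ask for `LaserPrinter1`, a server can fail or move and only the binding table changes, not the clients.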
    • Operating Systems
      • Two different types of underlying operating system have developed for use in distributed environments:
        • Network Operating System (NOS),
        • Distributed Operating System.
    • Network Operating System
      • Attach file systems from a remote server onto a local machine:
        • Remote file system appears part of the local directory structure,
        • User sees no difference between local and remote files.
      • NOS only transfers portions (blocks) of a file that are actually in use,
      • If a file is modified: changes are written back to the server,
      • Similarity with paging,
      • NOS allows other resources like printers to appear local to the client.
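The block-at-a-time transfer and write-back behaviour described above, and its similarity to paging, can be sketched with a small client-side cache. Everything here is an assumption for illustration: `FakeServer` stands in for the remote file server within one process, the class and method names are hypothetical, and the block size is kept tiny so the behaviour is easy to see.

```python
BLOCK_SIZE = 4  # tiny block size so the demo is easy to follow


class FakeServer:
    """Stands in for the remote file server; counts block transfers."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        self.reads = 0
        self.writes = 0

    def read_block(self, index: int) -> bytes:
        self.reads += 1
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def write_block(self, index: int, block: bytes) -> None:
        self.writes += 1
        start = index * BLOCK_SIZE
        self.data[start:start + len(block)] = block


class CachedRemoteFile:
    """Client view of a remote file: fetches blocks only when touched."""

    def __init__(self, server: FakeServer) -> None:
        self.server = server
        self.cache: dict[int, bytearray] = {}
        self.dirty: set[int] = set()

    def _block(self, index: int) -> bytearray:
        # Like a page fault: fetch the block only on first access.
        if index not in self.cache:
            self.cache[index] = bytearray(self.server.read_block(index))
        return self.cache[index]

    def read_byte(self, offset: int) -> int:
        index, within = divmod(offset, BLOCK_SIZE)
        return self._block(index)[within]

    def write_byte(self, offset: int, value: int) -> None:
        index, within = divmod(offset, BLOCK_SIZE)
        self._block(index)[within] = value
        self.dirty.add(index)  # remember to write this block back later

    def flush(self) -> None:
        """Write modified blocks back to the server."""
        for index in sorted(self.dirty):
            self.server.write_block(index, bytes(self.cache[index]))
        self.dirty.clear()
```

In this sketch, reading two bytes from the same block costs one transfer, and a modification reaches the server only when `flush` is called; a real NOS would also have to decide when to flush and how to keep several clients' caches consistent.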
    • Distributed Operating System
      • True distributed OS must (at very least) begin to blur the boundaries between machines,
      • Still responsible for managing local resources:
        • CPU, network interface, etc.
      • But also responsible for:
        • Advertising resources to clients,
        • Export/import/schedule processes to/from other machines.
      • Should do all this TRANSPARENTLY!
      • Remote Procedure Calls (RPC) provide a way to transmit data between processes on different machines transparently:
        • Hides underlying socket communications.
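To show what "hiding the underlying socket communications" can look like, here is a minimal sketch using XML-RPC from the Python standard library. XML-RPC is only one possible RPC mechanism and is not named in the slides; the `add` procedure and the `demo` helper are hypothetical names chosen for this example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a: int, b: int) -> int:
    """The remote procedure: it runs in the server, not in the caller."""
    return a + b


def demo() -> int:
    # Port 0 lets the OS pick a free port; logRequests=False keeps output quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    host, port = server.server_address

    # Serve requests in a background thread so this function can act as client.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            # Looks like a local call; the proxy does the network traffic.
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The call `proxy.add(2, 3)` reads like an ordinary function call, yet the arguments and result travel over a socket; the transparency is only partial, since the caller still has to supply the server's address.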
    • Hybrid Systems
      • Fully distributed OS not yet in widespread use.
      • Usually single machine OS specially adapted:
        • BUT not fully transparent.
      • A number of approaches proposed as a basis for future distributed operating systems:
        • E.g. CORBA, Distributed Computing Environment (DCE),
        • But what about other middleware: Jini, JXTA, Web Services?
    • Distributed Shared Memory
      • Allow memory to be shared by processes on different machines.
      • Allows a shared memory programming model to be used by cooperating processes in distributed systems:
        • Using this model a standalone system could be distributed with minimum effort.
      • Transparent to the programmer:
        • Location of processes not relevant to programmer,
        • Can be on local or remote machines.
    • Some Issues (1)…
      • Deadlock, mutual exclusion, etc. become more complicated in a distributed environment!
      • Distributed File systems:
        • Issues: data availability, transparency, caching, replication, consistency, etc.
      • Failure recovery:
        • Underlying network, servers, processes may fail when least expected: redundancy, replication
        • How to keep the system operating to maximum effect in the face of adverse conditions?
        • Checkpointing of computations?
        • Transactions must always be in a known state. How to ensure this?
      • Scalability
        • Does performance continue to grow when more machines are added or is there a saturation point? How do we overcome this?
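One common way to reason about the saturation point raised in the scalability question is Amdahl's law, which the slides do not mention by name; it is brought in here as an assumption. If a fraction p of the work can be spread across n machines and the rest must run serially, the speedup is 1 / ((1 - p) + p / n). The sketch below ignores communication costs, so real systems may do worse than it predicts.

```python
def amdahl_speedup(parallel_fraction: float, machines: int) -> float:
    """Speedup predicted by Amdahl's law for a given number of machines.

    parallel_fraction: share of the work that can be distributed (0.0 to 1.0).
    machines: number of machines sharing the distributable part.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)


if __name__ == "__main__":
    # With 90% of the work distributable, speedup can never reach 10x,
    # however many machines are added.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model the serial part of the task sets the ceiling, which suggests that overcoming saturation means reducing the serial portion (and the communication overhead) rather than simply adding machines.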
    • Summary
      • A very brief look at some high-level issues associated with distributed (operating) systems.
      • Main concepts:
        • Price, performance, availability, redundancy, reliability, sharing, transparency, scalability.