Wk6a

Transcript

  • 1. Distributed Systems - An introduction
    • Introduction,
    • Features of Distributed Systems,
    • Naming,
    • Operating Systems,
    • Distributed Shared Memory,
    • Other Issues.
  • 2. Introduction
    • Computer systems developed from:
      • Standalone machines,
      • to direct communications between two machines,
      • to networks: one machine can communicate with any other networked machine.
    • BUT:
      • In all cases the user is aware of the connection between machines,
      • and must issue explicit commands to move data.
  • 3. Distribution(1)
    • Distributed Systems (DS) build on the networking layer:
      • Groups of independent machines acting together as one
        • Cooperate on a task, not just data sharing,
        • Distribute computation among several physical machines.
    • Distribution should be transparent to user AND programs at system call interface:
      • User or programmer should not be able to tell that a remote machine is involved,
      • IDEALLY DS should look like a conventional system to users:
        • Appear to users as a single computer.
  • 4. Distribution (2)
    • Simplest possible distributed system architecture:
      • Some processes provide services to others, known as servers,
      • Processes which use those services are called clients.
    • This approach is known as a client/server system.
    • What the server does, and any data it returns to the client, varies depending on system requirements,
    • ALL different models of distributed computing can be reduced to this simple model.
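The client/server split described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names (`serve_once`, `request_service`, `demo`) and the "upper-case the request" service are hypothetical, and both processes are simulated on one machine over the loopback address.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server: accept one client, read its request, send back a reply."""
    conn, _addr = listener.accept()
    with conn:
        # A single recv() is enough for the short demo message used here.
        request = conn.recv(1024).decode("utf-8")
        # The "service" is hypothetical: return the request in upper case.
        conn.sendall(request.upper().encode("utf-8"))


def request_service(host: str, port: int, message: str) -> str:
    """Client: connect to the server, send a request, return the reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(message.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo() -> str:
    """Run a server thread and one client on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    host, port = listener.getsockname()
    server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    server.start()
    try:
        return request_service(host, port, "hello")
    finally:
        server.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(demo())
```

Note that the client must already know the server's address and port, which is the naming problem taken up later in the slides.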
  • 5. Features of Distributed Systems(1)
    • DS have features not found in standalone systems:
      • Economy and Performance
        • Price/performance ratio favours multiple small machines:
          • Especially commodity components,
          • See Top500 list in later slide,
          • Cumulative performance of microcomputers at a fraction of the cost of a mainframe.
        • Utilise spare CPU cycles by dynamically using idle workstations. How many PCs are there in the university's labs? In effect, free processing power!
        • Adapt to increased load and not collapse,
        • BUT more processors = more communications?: ISSUES HERE?!
  • 6. Features of Distributed Systems (2)
      • Reliability:
        • High reliability and fault tolerance (through redundancy, user need never know a problem occurred!).
      • Resource Sharing:
        • Economic: share an expensive device (e.g. a radio telescope),
        • Convenience: not convenient to share a company’s database via floppy disk: distribution and data update problems!!!
      • Incremental Growth:
        • Not necessary to buy all the processing power, memory and storage at one time,
        • System expands to keep pace with growing demand.
  • 7. Example: Top 500 List (1)
    • Example of price/performance mentioned earlier.
    • Clusters are a type of distributed system.
    • Top500 List: lists sites operating the 500 most powerful computer systems ( http://www.top500.org/ ):
      • Compiled twice yearly.
      • Entry to the top 10 positions requires > 9.8 TFlop/s (trillions of floating-point operations per second).
      • #1: BlueGene/L DD2 beta, IBM/DOE, USA, 70.72 TFlop/s:
        • 32,768 0.7 GHz PowerPC 440 CPUs,
      • #2: SGI Altix, Voltaire Infiniband, NASA, USA, 51.87 TFlop/s:
        • 10,160 1.5 GHz SGI Altix CPUs,
      • #3: Earth Simulator, Japan (Climate Modelling): 35.86 TFlop/s
        • 5,120 500 MHz NEC Vector CPUs (640 nodes of 8 CPUs each), 10 TB memory, 16 GB/s inter-node bandwidth,
        • Was #1 from 6/2002 until 11/2004 when BlueGene/L DD2 took over.
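As a rough worked example of the price/performance argument, the figures on this slide can be divided out to give the average throughput per CPU. The sketch below uses only the numbers quoted above; the helper names (`gflops_per_cpu`, `report`) are made up for illustration, and the result is a simple average that ignores interconnect and memory effects.

```python
# (system name, listed performance in TFlop/s, number of CPUs), from the slide
SYSTEMS = [
    ("BlueGene/L DD2 beta", 70.72, 32_768),
    ("SGI Altix", 51.87, 10_160),
    ("Earth Simulator", 35.86, 5_120),
]


def gflops_per_cpu(tflops: float, cpus: int) -> float:
    """Average throughput per CPU in GFlop/s (1 TFlop/s = 1000 GFlop/s)."""
    return tflops * 1000.0 / cpus


def report() -> dict:
    """Map each system name to its average GFlop/s per CPU."""
    return {name: gflops_per_cpu(tflops, cpus) for name, tflops, cpus in SYSTEMS}


if __name__ == "__main__":
    for name, value in report().items():
        print(f"{name}: {value:.2f} GFlop/s per CPU")
```

Each CPU contributes only a few GFlop/s; the headline figures come from running thousands of them together.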
  • 8. Example: Top 500 List (2)
    • Highest ranking Intel cluster (at the moment):
      • #10: NCSA, Urbana-Champaign, USA,
      • 9.81 TFlop/s,
      • PowerEdge 1750,
      • 2500 P4 Xeon 3.06 GHz,
      • Myrinet interconnect.
    • Commodity hardware:
      • The same that you might buy for your own machines (Intel Xeon),
      • Myrinet probably not sitting on the shelf at your local PC World!
    • Economies of scale (commercial off-the-shelf, or COTS, production):
      • Price/performance gains.
    • Operating System issues: How to coordinate various distributed components into a coherent system?!
  • 9. A Top 500 contender?
  • 10. Naming
    • Need to uniquely identify all resources
      • Individual machines, processes, files, printers, etc,
      • At system level identified by binary numbers,
      • Client needs to know identity of server machine AND identifier of process providing a service:
        • If server crashes and restarts: process may have a different process ID. Client unable to reach it.
        • If server machine crashes, another machine may take over servicing client requests. BUT client continues sending requests to old server’s binary address…
  • 11. Name server
    • At human level provide meaningful resource names:
      • e.g. LaserPrinter1 instead of 192.168.1.7 or the binary network address representation.
    • At machine level still identified by binary numbers,
    • Binding: The link between name and number,
    • Name server process:
      • Maintains a database of bindings,
      • Translates names to binary (or IP) addresses for the client,
      • If binary identifiers change, only the name server needs updating,
      • If a client can locate the name server, it can locate any other resources in the system.
    • A single name can be used to reference a number of servers: Server fail-over, load-balancing, etc…
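The binding idea on this slide can be sketched as a small in-memory table. This is an illustrative toy, not a real name service: the `NameServer` class and its `bind`/`resolve` methods are hypothetical names, and round-robin is just one assumed way of letting a single name refer to several servers.

```python
class NameServer:
    """Toy name server: keeps a database of name -> address bindings."""

    def __init__(self):
        self._bindings = {}  # name -> list of addresses
        self._next = {}      # name -> index of the next address to hand out

    def bind(self, name, addresses):
        """Create or replace the binding for a name."""
        if not addresses:
            raise ValueError("a binding needs at least one address")
        self._bindings[name] = list(addresses)
        self._next[name] = 0

    def resolve(self, name):
        """Translate a name to an address, rotating through the bound
        servers (simple round-robin load balancing)."""
        addresses = self._bindings[name]  # raises KeyError for unknown names
        index = self._next[name]
        self._next[name] = (index + 1) % len(addresses)
        return addresses[index]


if __name__ == "__main__":
    ns = NameServer()
    ns.bind("LaserPrinter1", ["192.168.1.7"])
    print(ns.resolve("LaserPrinter1"))
    # The printer moves: only the binding changes, clients keep using the name.
    ns.bind("LaserPrinter1", ["192.168.1.42"])  # hypothetical new address
    print(ns.resolve("LaserPrinter1"))
```

Because clients hold only the name, a restarted or replaced server is handled by updating one binding rather than every client.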
  • 12. Operating Systems
    • Two different types of underlying operating system have developed for use in distributed environments:
      • Network Operating System (NOS),
      • Distributed Operating System.
  • 13. Network Operating System
    • Attach file systems from a remote server onto a local machine:
      • Remote file system appears part of the local directory structure,
      • User sees no difference between local and remote files.
    • NOS only transfers portions (blocks) of a file that are actually in use,
    • If a file is modified: changes are written back to the server,
    • Similarity with paging,
    • NOS allows other resources like printers to appear local to the client.
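The block-level transfer and write-back behaviour described here, and its similarity to paging, can be imitated with a toy client-side cache. Everything in this sketch is an assumption made for illustration (the `RemoteFile` and `BlockCache` classes, the 4-byte block size, the explicit `flush` call); a real NOS does this inside the file-system layer.

```python
BLOCK_SIZE = 4  # deliberately tiny so the demo is easy to follow


class RemoteFile:
    """Stands in for a file held on the server."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.reads = 0  # number of blocks fetched by clients

    def read_block(self, n: int) -> bytes:
        self.reads += 1
        return bytes(self.data[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE])

    def write_block(self, n: int, block: bytes) -> None:
        self.data[n * BLOCK_SIZE:n * BLOCK_SIZE + len(block)] = block


class BlockCache:
    """Client side: fetches only the blocks in use, writes back dirty ones."""

    def __init__(self, remote: RemoteFile):
        self.remote = remote
        self.blocks = {}    # block number -> bytearray held locally
        self.dirty = set()  # block numbers modified since the last flush

    def _load(self, n: int) -> bytearray:
        if n not in self.blocks:  # "page fault": fetch the block on demand
            self.blocks[n] = bytearray(self.remote.read_block(n))
        return self.blocks[n]

    def read_byte(self, offset: int) -> int:
        n, i = divmod(offset, BLOCK_SIZE)
        return self._load(n)[i]

    def write_byte(self, offset: int, value: int) -> None:
        n, i = divmod(offset, BLOCK_SIZE)
        self._load(n)[i] = value
        self.dirty.add(n)

    def flush(self) -> None:
        """Write modified blocks back to the server."""
        for n in sorted(self.dirty):
            self.remote.write_block(n, bytes(self.blocks[n]))
        self.dirty.clear()
```

Reading two bytes from the same block costs one transfer, and a modification reaches the server only when the dirty block is written back.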
  • 14. Distributed Operating System
    • A true distributed OS must (at the very least) begin to blur the boundaries between machines,
    • Still responsible for managing local resources:
      • CPU, network interface, etc.
    • But also responsible for:
      • Advertising resources to clients,
      • Export/import/schedule processes to/from other machines.
    • Should do all this TRANSPARENTLY!
    • Remote Procedure Calls (RPC) provide a way to transmit data between processes on different machines transparently:
      • Hides underlying socket communications.
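To show what "hides underlying socket communications" looks like in practice, here is a small sketch using XML-RPC from the Python standard library. XML-RPC is only one possible RPC mechanism, chosen here because it needs no extra packages; the remote procedure `add` and the `demo` helper are hypothetical, and both ends run on the loopback address for the sake of the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The remote procedure: runs in the server, not in the caller."""
    return a + b


def demo():
    # Port 0 asks the OS for any free port; logging is switched off for brevity.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    host, port = server.server_address
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            # Looks like a local call; the proxy marshals the arguments,
            # sends them over a socket and unmarshals the result.
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The caller never touches a socket directly, which is the transparency the slide asks for, although failures of the remote machine can still surface as exceptions a purely local call would never raise.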
  • 15. Hybrid Systems
    • Fully distributed OS not yet in widespread use.
    • Usually a single-machine OS, specially adapted:
      • BUT not fully transparent.
    • A number of approaches proposed as a basis for future distributed operating systems:
      • E.g. CORBA, Distributed Computing Environment
      • But what about other middleware: Jini, Jxta, Web Services?
  • 16. Distributed Shared Memory
    • Allow memory to be shared by processes on different machines.
    • Allows a shared memory programming model to be used by cooperating processes in distributed systems:
      • Using this model a standalone system could be distributed with minimum effort.
    • Transparent to the programmer:
      • Location of processes not relevant to programmer,
      • Can be on local or remote machines.
  • 17. Some Issues (1)…
    • Deadlock, mutual exclusion, etc. become more complicated in a distributed environment!
    • Distributed File systems:
      • Issues: data availability, transparency, caching, replication, consistency, …
    • Failure recovery:
      • Underlying network, servers, processes may fail when least expected: redundancy, replication
      • How to keep the system operating to maximum effect in the face of adverse conditions?
      • Checkpointing of computations?
      • Transactions must always be in a known state. How to ensure this?
    • Scalability
      • Does performance continue to grow when more machines are added or is there a saturation point? How do we overcome this?
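The scalability question can be made concrete with a deliberately crude model. It is an assumption for illustration, not something taken from the slides: the work divides perfectly across n machines, but every extra machine adds a fixed communication cost. The `overhead` parameter and both function names are hypothetical.

```python
def speedup(n: int, overhead: float = 0.01) -> float:
    """Speedup over one machine under a toy cost model.

    Time on n machines = 1/n (perfectly divided work)
                       + overhead * (n - 1) (communication cost).
    Time on one machine is 1, so speedup = 1 / time.
    """
    if n < 1:
        raise ValueError("need at least one machine")
    return 1.0 / (1.0 / n + overhead * (n - 1))


def best_machine_count(max_n: int, overhead: float = 0.01) -> int:
    """Machine count in 1..max_n that gives the highest speedup."""
    return max(range(1, max_n + 1), key=lambda n: speedup(n, overhead))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20, 50, 100):
        print(f"{n:>3} machines: speedup {speedup(n):.2f}")
    print("best machine count:", best_machine_count(100))
```

Under these assumptions the speedup rises, peaks, and then falls as communication dominates, which is the saturation point the slide asks about. Real systems behave less neatly, but reducing per-machine communication is what pushes the peak further out.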
  • 18. Summary
    • A very brief look at some high-level issues associated with distributed (operating) systems.
    • Main concepts:
      • Price, performance, availability, redundancy, reliability, sharing, transparency, scalability.