Published in: Technology

  1. 1. Distributed Systems - An introduction <ul><li>Introduction, </li></ul><ul><li>Features of Distributed Systems, </li></ul><ul><li>Naming, </li></ul><ul><li>Operating Systems, </li></ul><ul><li>Distributed Shared Memory, </li></ul><ul><li>Other Issues. </li></ul>
  2. 2. Introduction <ul><li>Computer systems developed from: </li></ul><ul><ul><li>Standalone machines, </li></ul></ul><ul><ul><li>to direct communications between two machines, </li></ul></ul><ul><ul><li>to networks: one machine can communicate with any other networked machine. </li></ul></ul><ul><li>BUT: </li></ul><ul><ul><li>In all cases the user is always aware of the connection between machines, </li></ul></ul><ul><ul><li>Must issue explicit commands for the movement of data. </li></ul></ul>
  3. 3. Distribution (1) <ul><li>Distributed Systems (DS) build on the networking layer: </li></ul><ul><ul><li>Groups of independent machines acting together as one: </li></ul></ul><ul><ul><ul><li>Cooperate on a task, not just data sharing, </li></ul></ul></ul><ul><ul><ul><li>Distribute computation among several physical machines. </li></ul></ul></ul><ul><li>Distribution should be transparent to users AND programs at the system call interface: </li></ul><ul><ul><li>User or programmer should not be able to tell that a remote machine is involved, </li></ul></ul><ul><ul><li>IDEALLY DS should look like a conventional system to users: </li></ul></ul><ul><ul><ul><li>Appear to users as a single computer. </li></ul></ul></ul>
  4. 4. Distribution (2) <ul><li>Simplest possible distributed system architecture: </li></ul><ul><ul><li>Processes which provide services to others are known as servers, </li></ul></ul><ul><ul><li>Processes which use those services are called clients. </li></ul></ul><ul><li>This approach is known as a client/server system. </li></ul><ul><li>What the server does and any data it returns to the client vary depending on system requirements, </li></ul><ul><li>ALL different models of distributed computing can be reduced to this simple model. </li></ul>
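The client/server split described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "service" (upper-casing a string), the function names `serve_once` and `request_service`, and the use of a thread on the loopback address to stand in for a separate server machine are all made up for the example and do not come from the slides.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server process: accept one client, answer its request, then stop."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        # The hypothetical service: upper-case whatever the client sent.
        conn.sendall(request.upper().encode("utf-8"))


def request_service(address, message: str) -> str:
    """Client process: send one request and return the server's reply."""
    with socket.create_connection(address) as sock:
        sock.sendall(message.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


# Port 0 asks the OS for any free port, so the sketch does not clash
# with services already running on the machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_address = listener.getsockname()

# A thread stands in for the server machine in this single-process sketch.
server_thread = threading.Thread(target=serve_once, args=(listener,))
server_thread.start()

reply = request_service(server_address, "hello server")
server_thread.join()
listener.close()
print(reply)
```

Note that the client has to know `server_address` in advance, which is exactly the problem the Naming slides address.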
  5. 5. Features of Distributed Systems (1) <ul><li>DS have features not found in standalone systems: </li></ul><ul><ul><li>Economy and Performance </li></ul></ul><ul><ul><ul><li>Price/performance ratio favours multiple small machines: </li></ul></ul></ul><ul><ul><ul><ul><li>Especially commodity components, </li></ul></ul></ul></ul><ul><ul><ul><ul><li>See Top500 list in later slide, </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Cumulative performance of microcomputers at a fraction of the cost of a mainframe. </li></ul></ul></ul></ul><ul><ul><ul><li>Utilise spare CPU cycles by dynamically using idle workstations. How many PCs are in the university labs? In effect, free processing power! </li></ul></ul></ul><ul><ul><ul><li>Adapt to increased load and not collapse, </li></ul></ul></ul><ul><ul><ul><li>BUT more processors = more communications?: ISSUES HERE?! </li></ul></ul></ul>
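One way to make the "more processors = more communications" question concrete is to count machine pairs. The sketch below assumes the worst case, in which every machine may need to talk to every other machine; real systems usually communicate far less densely, so treat the numbers as an upper bound. The function name `pairwise_links` is invented for the example.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct machine pairs among n machines: n(n-1)/2."""
    return n * (n - 1) // 2


# Pair counts for a few system sizes (worst-case, all-to-all assumption).
link_counts = {n: pairwise_links(n) for n in (2, 10, 100, 1000)}
for n, links in link_counts.items():
    print(f"{n} machines -> {links} possible pairs")
```

The machine count grows linearly while the number of possible pairs grows quadratically, which is one reason communication cost can limit scalability.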
  6. 6. Features of Distributed Systems (2) <ul><ul><li>Reliability: </li></ul></ul><ul><ul><ul><li>High reliability and fault tolerance (through redundancy, user need never know a problem occurred!). </li></ul></ul></ul><ul><ul><li>Resource Sharing: </li></ul></ul><ul><ul><ul><li>Economic: Share an expensive device (e.g. a radio telescope) </li></ul></ul></ul><ul><ul><ul><li>Convenience: not convenient to share a company’s database via floppy disk: distribution and data update problems! </li></ul></ul></ul><ul><ul><li>Incremental Growth: </li></ul></ul><ul><ul><ul><li>Not necessary to buy all processing power, memory and storage at one time, </li></ul></ul></ul><ul><ul><ul><li>System expands to keep pace with growing demand. </li></ul></ul></ul>
  7. 7. Example: Top 500 List (1) <ul><li>Example of price/performance mentioned earlier. </li></ul><ul><li>Clusters are a type of distributed system. </li></ul><ul><li>Top500 List: lists sites operating the 500 most powerful computer systems ( http://www.top500.org/ ): </li></ul><ul><ul><li>Compiled twice yearly. </li></ul></ul><ul><ul><li>Entry to top 10 positions requires > 9.8 TFlop/s (trillions of floating-point operations per second). </li></ul></ul><ul><ul><li>#1: BlueGene/L DD2 beta, IBM/DOE, USA, 70.72 TFlop/s: </li></ul></ul><ul><ul><ul><li>32,768 0.7 GHz PowerPC 440 CPUs, </li></ul></ul></ul><ul><ul><li>#2: SGI Altix, Voltaire Infiniband, NASA, USA, 51.87 TFlop/s: </li></ul></ul><ul><ul><ul><li>10,160 1.5 GHz SGI Altix CPUs, </li></ul></ul></ul><ul><ul><li>#3: Earth Simulator, Japan (Climate Modelling): 35.86 TFlop/s </li></ul></ul><ul><ul><ul><li>5,120 500 MHz NEC Vector CPUs (640 8-CPU nodes), 10 TB memory, 16 GB/s inter-node bandwidth, </li></ul></ul></ul><ul><ul><ul><li>Was #1 from 6/2002 until 11/2004 when BlueGene/L DD2 took over. </li></ul></ul></ul>
  8. 8. Example: Top 500 List (2) <ul><li>Highest ranking Intel cluster (at the moment): </li></ul><ul><ul><li>#10: NCSA, Urbana-Champaign, USA, </li></ul></ul><ul><ul><li>9.81 TFlop/s, </li></ul></ul><ul><ul><li>PowerEdge 1750, </li></ul></ul><ul><ul><li>2500 P4 Xeon 3.06 GHz, </li></ul></ul><ul><ul><li>Myrinet interconnect. </li></ul></ul><ul><li>Commodity hardware: </li></ul><ul><ul><li>The same that you might buy for your own machines (Intel Xeon), </li></ul></ul><ul><ul><li>Myrinet probably not sitting on the shelf at your local PC World! </li></ul></ul><ul><li>Economies of scale (COTS production): </li></ul><ul><ul><li>Price/performance gains. </li></ul></ul><ul><li>Operating System issues: How to coordinate various distributed components into a coherent system?! </li></ul>
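The figures on the two Top 500 slides can be turned into a rough per-CPU throughput by dividing each system's TFlop/s by its CPU count. The short calculation below uses only the numbers quoted on the slides; the label "NCSA PowerEdge 1750 cluster" and the helper name `gflops_per_cpu` are made up for the example. It says nothing about price, which the slides do not give.

```python
# Figures copied from the two Top 500 slides: (name, TFlop/s, CPU count).
systems = [
    ("BlueGene/L DD2 beta", 70.72, 32_768),
    ("SGI Altix", 51.87, 10_160),
    ("Earth Simulator", 35.86, 5_120),
    ("NCSA PowerEdge 1750 cluster", 9.81, 2_500),
]


def gflops_per_cpu(tflops: float, cpus: int) -> float:
    """Average throughput per CPU in GFlop/s (1 TFlop/s = 1000 GFlop/s)."""
    return tflops * 1000 / cpus


per_cpu = {name: gflops_per_cpu(tflops, cpus) for name, tflops, cpus in systems}
for name, value in per_cpu.items():
    print(f"{name}: {value:.2f} GFlop/s per CPU")
```

On these figures the top-ranked system has the lowest throughput per CPU of the four: it reaches first place through the number of processors rather than the speed of each one.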
  9. 9. A Top 500 contender?
  10. 10. Naming <ul><li>Need to uniquely identify all resources </li></ul><ul><ul><li>Individual machines, processes, files, printers, etc, </li></ul></ul><ul><ul><li>At system level identified by binary numbers, </li></ul></ul><ul><ul><li>Client needs to know identity of server machine AND identifier of process providing a service: </li></ul></ul><ul><ul><ul><li>If server crashes and restarts: process may have a different process ID. Client unable to reach it. </li></ul></ul></ul><ul><ul><ul><li>If server machine crashes, another machine may take over servicing client requests. BUT client continues sending requests to old server’s binary address… </li></ul></ul></ul>
  11. 11. Name server <ul><li>At human level provide meaningful resource names: </li></ul><ul><ul><li>e.g. LaserPrinter1 instead of an IP address or the binary network address representation. </li></ul></ul><ul><li>At machine level still identified by binary numbers, </li></ul><ul><li>Binding: The link between name and number, </li></ul><ul><li>Name server process: </li></ul><ul><ul><li>Maintains a database of bindings, </li></ul></ul><ul><ul><li>Translates names to binary (or IP) addresses for the client, </li></ul></ul><ul><ul><li>If binary identifiers change then only the name server needs updating, </li></ul></ul><ul><ul><li>If a client can locate the name server, it can locate any other resources in the system. </li></ul></ul><ul><li>A single name can be used to reference a number of servers: Server fail-over, load-balancing, etc… </li></ul>
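A name server of the kind described here can be modelled as a table of bindings. The sketch below is an in-memory toy, not a real directory service: the class name `NameServer`, its methods, the name `fileserver` and all of the addresses are hypothetical. It shows a binding being updated without the client changing, and one name rotating across two servers as a simple form of load balancing.

```python
from itertools import cycle


class NameServer:
    """Toy name server: maps human-readable names to lists of addresses."""

    def __init__(self) -> None:
        self._bindings = {}   # name -> list of addresses
        self._rotation = {}   # name -> endless iterator over that list

    def bind(self, name: str, addresses: list) -> None:
        """Create or replace the binding for a name."""
        if not addresses:
            raise ValueError("a binding needs at least one address")
        self._bindings[name] = list(addresses)
        self._rotation[name] = cycle(self._bindings[name])

    def resolve(self, name: str) -> str:
        """Return one address for the name, rotating through the list."""
        if name not in self._bindings:
            raise KeyError(f"no binding for {name!r}")
        return next(self._rotation[name])


ns = NameServer()

# The client only ever asks for "LaserPrinter1" (addresses are made up).
ns.bind("LaserPrinter1", ["10.0.0.5"])
before_move = ns.resolve("LaserPrinter1")

# The printer moves: only the name server's binding is updated.
ns.bind("LaserPrinter1", ["10.0.0.9"])
after_move = ns.resolve("LaserPrinter1")

# One name, two servers: successive lookups alternate between them.
ns.bind("fileserver", ["10.0.0.20", "10.0.0.21"])
picks = [ns.resolve("fileserver") for _ in range(3)]
print(before_move, after_move, picks)
```

Fail-over could be handled the same way, by rebinding the name to the surviving server's address when a machine crashes.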
  12. 12. Operating Systems <ul><li>Two different types of underlying operating system have developed for use in distributed environments: </li></ul><ul><ul><li>Network Operating System (NOS), </li></ul></ul><ul><ul><li>Distributed Operating System. </li></ul></ul>
  13. 13. Network Operating System <ul><li>Attach file systems from a remote server onto a local machine: </li></ul><ul><ul><li>Remote file system appears part of the local directory structure, </li></ul></ul><ul><ul><li>User sees no difference between local and remote files. </li></ul></ul><ul><li>NOS only transfers portions (blocks) of a file that are actually in use, </li></ul><ul><li>If a file is modified: changes are written back to the server, </li></ul><ul><li>Similarity with paging, </li></ul><ul><li>NOS allows other resources like printers to appear local to the client. </li></ul>
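The block-level transfer and write-back behaviour described on this slide, and its similarity with paging, can be sketched with a small client-side cache. Everything here is an assumption made for illustration: the class names `RemoteFile` and `CachedFile`, the 4-byte block size, and the explicit `flush` call. A real network file system has its own protocol and consistency rules.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


class RemoteFile:
    """Stands in for a file held on the server."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        self.blocks_sent = 0  # how many blocks crossed the "network"

    def read_block(self, index: int) -> bytes:
        self.blocks_sent += 1
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def write_block(self, index: int, block: bytes) -> None:
        start = index * BLOCK_SIZE
        self.data[start:start + len(block)] = block


class CachedFile:
    """Client-side view: fetches blocks on demand, writes dirty ones back."""

    def __init__(self, remote: RemoteFile) -> None:
        self.remote = remote
        self.cache = {}     # block index -> bytearray
        self.dirty = set()  # indexes of modified blocks

    def _block(self, index: int) -> bytearray:
        if index not in self.cache:  # comparable to a page fault
            self.cache[index] = bytearray(self.remote.read_block(index))
        return self.cache[index]

    def read_byte(self, offset: int) -> int:
        return self._block(offset // BLOCK_SIZE)[offset % BLOCK_SIZE]

    def write_byte(self, offset: int, value: int) -> None:
        self._block(offset // BLOCK_SIZE)[offset % BLOCK_SIZE] = value
        self.dirty.add(offset // BLOCK_SIZE)

    def flush(self) -> None:
        """Write every modified block back to the server."""
        for index in sorted(self.dirty):
            self.remote.write_block(index, bytes(self.cache[index]))
        self.dirty.clear()


server_file = RemoteFile(b"abcdefghijklmnop")  # 4 blocks of 4 bytes
local = CachedFile(server_file)
local.write_byte(5, ord("X"))       # touches only block 1
neighbour = local.read_byte(4)      # same block, served from the cache
local.flush()
print(bytes(server_file.data), server_file.blocks_sent)
```

Only one of the four blocks is ever transferred from the server, even though the file was both read and modified.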
  14. 14. Distributed Operating System <ul><li>True distributed OS must (at very least) begin to blur the boundaries between machines, </li></ul><ul><li>Still responsible for managing local resources: </li></ul><ul><ul><li>CPU, network interface, etc. </li></ul></ul><ul><li>But also responsible for: </li></ul><ul><ul><li>Advertising resources to clients, </li></ul></ul><ul><ul><li>Export/import/schedule processes to/from other machines. </li></ul></ul><ul><li>Should do all this TRANSPARENTLY! </li></ul><ul><li>Remote Procedure Calls (RPC) provide a way to transmit data between processes on different machines transparently: </li></ul><ul><ul><li>Hides underlying socket communications. </li></ul></ul>
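To show what "hides underlying socket communications" means in practice, here is a minimal sketch using the XML-RPC modules from the Python standard library. XML-RPC is just one RPC mechanism chosen for convenience; the slides do not name a specific one. The procedure `add` and the loopback server running in a thread are assumptions made for the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a: int, b: int) -> int:
    """The procedure that runs on the server side."""
    return a + b


# Port 0 lets the OS choose a free port; logRequests=False keeps output quiet.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# The proxy turns an ordinary-looking method call into a network request;
# the caller never touches a socket directly.
with ServerProxy(f"http://{host}:{port}") as proxy:
    result = proxy.add(2, 3)

server.shutdown()
server.server_close()
print(result)
```

From the caller's point of view `proxy.add(2, 3)` looks like a local call, although a remote call can still fail or be slow in ways a local call cannot, so the transparency is not complete.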
  15. 15. Hybrid Systems <ul><li>Fully distributed OS not yet in widespread use. </li></ul><ul><li>Usually single machine OS specially adapted: </li></ul><ul><ul><li>BUT not fully transparent. </li></ul></ul><ul><li>A number of approaches proposed as basis for future distributed operating systems: </li></ul><ul><ul><li>e.g. CORBA, Distributed Computing Environment </li></ul></ul><ul><ul><li>But what about other middleware: Jini, JXTA, Web Services? </li></ul></ul>
  16. 16. Distributed Shared Memory <ul><li>Allow memory to be shared by processes on different machines. </li></ul><ul><li>Allows a shared memory programming model to be used by cooperating processes in distributed systems: </li></ul><ul><ul><li>Using this model a standalone system could be distributed with minimum effort. </li></ul></ul><ul><li>Transparent to the programmer: </li></ul><ul><ul><li>Location of processes not relevant to programmer, </li></ul></ul><ul><ul><li>Can be on local or remote machines. </li></ul></ul>
  17. 17. Some Issues (1)… <ul><li>Deadlock, mutual exclusion, etc. become more complicated in a distributed environment! </li></ul><ul><li>Distributed file systems: </li></ul><ul><ul><li>Issues: data availability, transparency, caching, replication, consistency, ... </li></ul></ul><ul><li>Failure recovery: </li></ul><ul><ul><li>Underlying network, servers, processes may fail when least expected: redundancy, replication </li></ul></ul><ul><ul><li>How to keep the system operating to maximum effect in the face of adverse conditions? </li></ul></ul><ul><ul><li>Checkpointing of computations? </li></ul></ul><ul><ul><li>Transactions must always be in a known state. How to ensure this? </li></ul></ul><ul><li>Scalability </li></ul><ul><ul><li>Does performance continue to grow when more machines are added or is there a saturation point? How do we overcome this? </li></ul></ul>
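As one possible answer to the checkpointing question, the sketch below saves the state of a long computation after every step and resumes from the last saved state after a simulated failure. It is a single-machine illustration only: the function `run`, the `crash_at` parameter, and the JSON checkpoint file are assumptions, and a distributed checkpoint would also have to coordinate state across machines.

```python
import json
import os
import tempfile


def run(total_steps: int, checkpoint_path: str, crash_at=None) -> int:
    """Sum 0..total_steps-1, saving progress after every step."""
    state = {"next_step": 0, "partial_sum": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last known state

    for step in range(state["next_step"], total_steps):
        if step == crash_at:
            raise RuntimeError("simulated failure")
        state["partial_sum"] += step
        state["next_step"] = step + 1
        # Write to a temporary file, then rename it over the old one, so a
        # crash mid-write does not leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["partial_sum"]


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "checkpoint.json")
    try:
        run(10, path, crash_at=6)      # fails part-way through
    except RuntimeError:
        pass
    with open(path) as f:
        resumed_from = json.load(f)["next_step"]
    total = run(10, path)              # picks up where the first run stopped

print(resumed_from, total)
```

The write-then-rename step is a small version of the "known state" requirement: the checkpoint file on disk is always either the previous complete state or the new complete state.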
  18. 18. Summary <ul><li>A very brief look at some high-level issues associated with distributed (operating) systems. </li></ul><ul><li>Main concepts: </li></ul><ul><ul><li>Price, performance, availability, redundancy, reliability, sharing, transparency, scalability. </li></ul></ul>