An Operating System for Multicore and Clouds: Mechanisms and Implementation


Introduction to FOS, An Operating System for Multicore and Clouds: Mechanisms and Implementation


  1. An Operating System for Multicore and Clouds: Mechanisms and Implementation
     D. Wentzlaff, C. Gruenwald III, N. Beckmann, K. Modzelewski, A. Belay, L. Youseff, J. Miller, A. Agarwal (CSAIL, MIT)
     Jiannan Ouyang, Ph.D. Student
  2. Outline
     • Introduction
     • Multicore and Cloud Operating System Challenges
     • Architecture
     • Case Studies
     • Implementation and Results
     Jiannan Ouyang, CS Ph.D.@PITT 5/18/2011
  3. Introduction
     • Multicore and cloud computers need new operating systems
       – Traditional OSes do not scale well
       – Current IaaS systems require users to explicitly manage resources and machine boundaries
         • The programmer sees a fractured, non-uniform view of resources
         • Users must often build or buy server load balancers to schedule work across machines
  4. Challenges of Multicore & Cloud Operating Systems
     Problems OS designers need to address in the next decade:
     • Scalability
     • Variability of Demand
     • Faults
     • Programming Challenges
  5. Scalability
     • Current OSes were designed for single-processor systems or systems with a small number of processors
     • Manycore computer systems
       – Limitations of locks, locality aliasing, reliance on shared memory
     • Data centers with thousands of servers
  6. Variability of Demand
     • The resource that varies in a manycore system is the number of cores in use
       – Map processes to live cores
     • Elasticity of the cloud
  7. Faults
     • Hardware faults
       – Dying cores and bit flips
     • Performance interference
       – Interference between apps and VMs impacts QoS
     • Software faults
       – Parallel programming and debugging are hard
  8. Programming Challenges
     • Resource management must be done by the cloud application
     • Load balancing is hard in the cloud
     • No uniform programming model
       – Intra- and inter-machine communication
  9. Factored Operating System (FOS)
     • A new OS should provide scalability, elasticity, fault tolerance, and a simple programming model
       – Single system image OS
       – Microkernel: OS services run in user space and communicate via messages
       – Each service consists of a group of servers, called a fleet, distributed across the underlying cores and machines
       – Message passing is mapped transparently across cores and machines
  10. Single System Image
      • Ease of administration
      • Consistency
      • Transparent sharing
      • Fault tolerance
      • Informed optimizations
  11. Architecture of FOS
      • Microkernel
        – Messaging, name cache, time multiplexing of cores, API
      • Messaging
        – Inter-process communication and synchronization
        – Each process has a number of mailboxes
      • Naming
        – All servers within a fleet register under a given name
      • OS Services
        – Fleet: spatially distributed, cooperating servers
        – FS fleet, naming fleet, scheduling fleet, proxy network server fleet…
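The naming and mailbox bullets on the slide above can be illustrated with a small sketch. This is not the fos API: `Mailbox`, `NameRegistry`, `send`, the name `/sys/fs`, and the round-robin choice among fleet members are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from itertools import count


class Mailbox:
    """A FIFO queue standing in for one process mailbox (hypothetical)."""

    def __init__(self, owner):
        self.owner = owner
        self.queue = deque()

    def deliver(self, message):
        self.queue.append(message)

    def receive(self):
        # Return the oldest message, or None if the mailbox is empty.
        return self.queue.popleft() if self.queue else None


class NameRegistry:
    """Maps a fleet name to the mailboxes of its member servers."""

    def __init__(self):
        self._members = defaultdict(list)
        self._counters = defaultdict(count)

    def register(self, name, mailbox):
        # Every server in a fleet registers under the same name.
        self._members[name].append(mailbox)

    def lookup(self, name):
        # Round-robin over registered members (an assumed policy).
        members = self._members.get(name)
        if not members:
            raise KeyError(f"no server registered under {name!r}")
        index = next(self._counters[name]) % len(members)
        return members[index]


def send(registry, name, message):
    """Resolve the name to one fleet member and deliver the message."""
    mailbox = registry.lookup(name)
    mailbox.deliver(message)
    return mailbox.owner
```

A client only ever names the service; which fleet member receives the message is decided at lookup time, so servers can be added to the fleet without changing the caller.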
  12. Architecture of FOS
  13. Messaging
      • Messaging over shared memory or over the network
      • Transparent intra- and inter-machine communication
      • Forces programmers to think carefully about the amount of shared data
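One way to picture transparent intra- and inter-machine messaging is a single `send` call that picks the transport from the destination's location. The sketch below is only an illustration under that assumption: `Address`, `Messenger`, and the two transport stubs are hypothetical names, and the stubs record what they would do instead of moving any data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    machine: str  # hypothetical machine identifier
    mailbox: str  # hypothetical mailbox name on that machine


class Messenger:
    """Chooses a transport per message; the caller never sees which one."""

    def __init__(self, local_machine):
        self.local_machine = local_machine
        self.log = []  # records (transport, destination, payload)

    def _send_shared_memory(self, dest, payload):
        # Stub: a real system would copy into a shared-memory queue.
        self.log.append(("shared-memory", dest, payload))

    def _send_network(self, dest, payload):
        # Stub: a real system would hand the message to a network proxy.
        self.log.append(("network", dest, payload))

    def send(self, dest, payload):
        # Same call for local and remote destinations.
        if dest.machine == self.local_machine:
            self._send_shared_memory(dest, payload)
            return "shared-memory"
        self._send_network(dest, payload)
        return "network"
```

Because the sender's code is identical in both cases, moving a server to another machine changes only the cost of the message, not the program.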
  14. Parallel Data Structure
      • Manages the state associated with a particular service among the members of the fleet
      • Common container interface: abstracts several implementations that provide different consistency, replication, and performance properties
      • Existing solutions in the P2P community
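The idea of a common container interface with interchangeable implementations can be sketched as follows. The interface (`put`/`get`) and both implementations are assumptions made for illustration; the slides do not specify the actual interface or its consistency protocols, and the "replicas" here are just in-process dictionaries.

```python
from abc import ABC, abstractmethod


class Container(ABC):
    """Hypothetical common interface for fleet state."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...


class SingleCopyContainer(Container):
    """One copy of the data: cheap writes, no redundancy."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class ReplicatedContainer(Container):
    """Writes go to every replica; reads use the first replica holding the key."""

    def __init__(self, replicas):
        if replicas < 1:
            raise ValueError("need at least one replica")
        self._replicas = [{} for _ in range(replicas)]

    def put(self, key, value):
        for replica in self._replicas:
            replica[key] = value

    def get(self, key):
        for replica in self._replicas:
            if key in replica:
                return replica[key]
        raise KeyError(key)

    def drop_replica(self, index):
        # Simulates losing one replica, e.g. to a failed server.
        del self._replicas[index]
```

A service written against `Container` can switch between the two without code changes, trading write cost for tolerance of a lost replica.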
  15. Case Study – File System
  16. Case Study – Spawning Server
      The spawn server decides where to create a new server process:
      • Same VM
      • Another existing VM
        – spawn1 -> proxy1 -> proxy2 -> spawn2
      • New VM
        – Create the VM by sending a request to the cloud manager
        – Add the VM to the group, exchange name information, and notify all other machines
        – Forward the spawn request to the new VM
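The three placement options on the slide above can be written as a small decision function. The slide does not say how the spawn server chooses among them, so the criterion used here (prefer any VM with idle cores, local first) and the name `choose_placement` are assumptions.

```python
def choose_placement(local_vm, free_cores):
    """Pick where to spawn a new server process.

    free_cores maps a VM name to its number of idle cores
    (a hypothetical input; the real decision inputs are not given).
    """
    # 1. Same VM: cheapest, no proxy hop needed.
    if free_cores.get(local_vm, 0) > 0:
        return ("same-vm", local_vm)
    # 2. Another existing VM: the request would travel
    #    spawn1 -> proxy1 -> proxy2 -> spawn2.
    for vm, idle in sorted(free_cores.items()):
        if vm != local_vm and idle > 0:
            return ("existing-vm", vm)
    # 3. New VM: would require asking the cloud manager for one.
    return ("new-vm", None)
```

For example, `choose_placement("vm1", {"vm1": 0, "vm2": 3})` selects `vm2`, and the function falls back to requesting a new VM only when no existing VM has an idle core.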
  17. Case Study – Elastic Fleet
      • A watchdog process monitors the queue length
      • Adds a server to the fleet
        – Spawn, handshaking
      • Makes global decisions for the elastic fleet
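A watchdog that grows a fleet based on queue length could look roughly like the sketch below. The averaging rule, the default threshold of 8, and the names `should_grow` and `watchdog_step` are assumptions; the slides give only the general mechanism.

```python
def should_grow(queue_lengths, threshold=8):
    """Return True when the fleet's average queue length exceeds the threshold.

    queue_lengths holds one pending-request count per fleet member.
    """
    if not queue_lengths:
        raise ValueError("fleet has no members to monitor")
    average = sum(queue_lengths) / len(queue_lengths)
    return average > threshold


def watchdog_step(queue_lengths, spawn_server, threshold=8):
    """One monitoring pass: spawn a new fleet member if the fleet is overloaded.

    spawn_server is a caller-supplied callable (hypothetical) that starts a
    server and returns a handle to it; handshaking with the fleet is left out.
    """
    if should_grow(queue_lengths, threshold):
        return spawn_server()
    return None
```

Looking at the whole fleet's queues, not a single server's, is what makes the decision global: one slow server does not trigger growth if the rest of the fleet is idle.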
  18. Implementation
      • Xen para-virtualized machine (PVM) OS
      • Runs on EC2 or Eucalyptus cloud infrastructure
      • Configuration
        – 16-machine cluster; each machine has 8 cores running at 3.16 GHz, 8 GB of main memory, and 1 Gb Ethernet
  19. Result - syscall
  20. Result – fos network stack & app
  21. Result - FS
  22. Related Work
      • Traditional microkernels
        – e.g. Mach, L4, …
      • Distributed OSes
        – e.g. Amoeba, Sprite, and Clouds
      • Cloud computing infrastructure
        – e.g. Google AppEngine and MS Azure
  23. Conclusion
      • Cloud computing and multicores have created new classes of platforms for application development, and with them new challenges
      • Fos seeks to surmount these challenges by presenting a single system interface to the user and by providing a programming model that allows OS services to scale with demand
      • Fos is scalable and adaptive