A Unified Execution Model for Cloud Computing



  1. A Unified Execution Model for Cloud Computing
     Eric Van Hensbergen & Phillip Stanley-Marbell, IBM Research
     Noah Paul Evans, Alcatel-Lucent Bell Labs
     10 October 2009, ACM SIGOPS LADIS 2009 Workshop
     © 2009 IBM Corporation
  2. Motivation
     (Figure: cloud service layers SaaS, PaaS, IaaS.)
     Clouds have renewed interest in flexible, distributed models of interacting with computing resources. The true power of the cloud will only be realized through cloud-aware applications and cloud-aware operating systems that can directly leverage the power and flexibility of the cloud while also providing reliability and scalability.
     We believe that this is a systems problem, best solved through standard interfaces and system calls provided by the operating system, so that the facility is not coupled to a particular programming language or runtime middleware.
     GOAL: break down the barriers between infrastructure management and traditional operating system resource management, creating a cohesive interface to request, manage and release resources from within the cloud.
  3. Related Work
     • Distributed Operating Systems
       – V, Amoeba, Eden, Cambridge Distributed Computing System, etc.
     • XCPU & other HPC Distributed Middleware
     • SnowFlock
     • PaaS
       – Google App Engine, Microsoft Azure
  4. Implementation Overview
     • Interfaces to provisioning, execution, monitoring and control are maintained as a synthetic file system
       – UNIX proc -> Plan 9 proc -> XCPU proc -> UEM
     • Brasil Co-Operating System
       – runs hosted on Linux, OS X, BSD, Windows, Plan 9
       – exports interfaces into the native file system
       – applications and runtimes can build library interfaces
     • Can use Zeroconf to locate other nodes, or can be parameterized to contact a "parent"
     • Peer node UEM namespaces are union mounted, allowing transitive access to nodes on separate network segments via gateway nodes
     • Modular infrastructure with multiple levels of plug-ins
       – Hierarchical organization
       – Policy engines governing requests and manipulating in-flight resources
       – Back-end resource providers (physical infrastructure, hypervisors, virtual machines)
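The union-mount bullet on this slide can be illustrated with a small in-memory model. This is only a sketch under assumptions: the class `UemNode`, its methods, and the node names are hypothetical and do not come from the slides or from the Brasil code base. It shows how a node that mounts a gateway's namespace can reach a peer it has no direct connection to.

```python
from typing import Dict, Iterable, List


class UemNode:
    """Hypothetical stand-in for one UEM instance and its exported namespace."""

    def __init__(self, name: str, local: Iterable[str] = ()) -> None:
        self.name = name
        self.local = set(local)               # files served by this node itself
        self.peers: Dict[str, "UemNode"] = {}  # union-mounted peer namespaces

    def mount(self, peer: "UemNode") -> None:
        """Union mount a peer's namespace under its name."""
        self.peers[peer.name] = peer

    def ls(self) -> List[str]:
        """Directory listing: local files unioned with mounted peers."""
        return sorted(self.local | set(self.peers))

    def walk(self, path: str) -> "UemNode":
        """Resolve a path one component at a time, hopping through peers."""
        node = self
        for component in filter(None, path.split("/")):
            if component not in node.peers:
                raise FileNotFoundError(path)
            node = node.peers[component]
        return node


# 'inner' sits on a separate segment and is only visible to 'gateway'.
client = UemNode("client", local=["clone"])
gateway = UemNode("gateway", local=["clone"])
inner = UemNode("inner", local=["clone"])
client.mount(gateway)
gateway.mount(inner)
```

Here `client.walk("gateway/inner")` reaches `inner` through the gateway, while `client.walk("inner")` fails because no direct mount exists.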
  5. Synthetic File Systems
     • All interaction is through a hierarchy of synthetic file systems
       – language independent & easily distributable
       – natural aggregation and security models
     • Multiple hierarchical organizations with different scope rules are possible
       – Canonical organization is typically based on the network hierarchy
       – Exploring task-based logical hierarchies combining logical nodes & threads
     • Parent levels provide aggregate interfaces for controlling children
     • Beyond static hierarchies: web-inspired query paths can be used to perform provisioning and aggregate control

         % ls /proc/query/x86/gpus=1/mem=4G
         # will return a list of matching physical systems
         0/ 1/ 2/
         % echo kill > /proc/query/user=ericvh/ctl
         # will terminate all LPARs and/or threads belonging
         # to user ericvh
         % cat /proc/query/os=linux/status
         # returns status of all Linux logical partitions
         node01 up 10:35, 4 users, load average: 1.08
         node02 up 05:23, 1 users, load average: 0.50
         node03 down
         ...
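One way to read the query paths on this slide is as a conjunction of attribute filters. The sketch below is an assumption-laden illustration, not the UEM implementation: the inventory `NODES`, its attribute names, and the rule that a bare component such as `x86` matches any attribute value are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical inventory; attribute names are illustrative only.
NODES: Dict[str, Dict[str, str]] = {
    "0": {"arch": "x86", "gpus": "1", "mem": "4G", "os": "linux", "user": "ericvh"},
    "1": {"arch": "x86", "gpus": "1", "mem": "4G", "os": "plan9", "user": "noah"},
    "2": {"arch": "ppc", "gpus": "0", "mem": "2G", "os": "linux", "user": "ericvh"},
}

# Trailing components treated as files rather than filters (taken from the slide).
LEAF_FILES = {"ctl", "status"}

Term = Tuple[Optional[str], str]


def parse_query(path: str) -> Tuple[List[Term], Optional[str]]:
    """Split a /proc/query/... path into filter terms and an optional file name."""
    prefix = "/proc/query/"
    if not path.startswith(prefix):
        raise ValueError(f"not a query path: {path}")
    parts = [p for p in path[len(prefix):].split("/") if p]
    leaf = parts.pop() if parts and parts[-1] in LEAF_FILES else None
    terms: List[Term] = []
    for part in parts:
        key, sep, value = part.partition("=")
        # key=value is an exact attribute match; a bare word is a tag.
        terms.append((key, value) if sep else (None, part))
    return terms, leaf


def matches(attrs: Dict[str, str], terms: List[Term]) -> bool:
    """True when a node satisfies every term (assumed AND semantics)."""
    for key, value in terms:
        if key is None:
            if value not in attrs.values():
                return False
        elif attrs.get(key) != value:
            return False
    return True


def query(path: str, nodes: Dict[str, Dict[str, str]] = NODES) -> List[str]:
    """Return the sorted names of nodes selected by a query path."""
    terms, _leaf = parse_query(path)
    return sorted(name for name, attrs in nodes.items() if matches(attrs, terms))
```

With this toy inventory, `query("/proc/query/user=ericvh/ctl")` selects the nodes whose control files a write of `kill` would be forwarded to.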
  6. Different Organizational Models
     (Figure: three example hierarchies shown side by side.)
     • /proc/phys/ (physical): rack01/, containing chassis01/ and chassis02/, then blade01/, then lpar04/, then instances 0/ ... n/
     • /proc/net/ (network): ibm.com/, containing research/ and www/, then www-01/ and www-02/, then instances 0/ ... n/
     • /proc/logical/ (task-based): ericvh/, then map-reduce/, then node01/ and node02/, then worker00/ ... workerxx/
     • clone files appear at several levels of each tree; each instance directory holds ctl, info, stdin, stdout and stderr
  7. Execution & Control Mechanisms
     • At leaf level, the interface is very similar to the XCPU2 [CLUSTERS 09] interface
     • clone is used to establish a new instance
     • ctl is a command console for controlling execution and other provisioning operations (triggering migration, network splicing, etc.)
     • ns provides fs/storage namespace definitions
     • env & args provide environment variables and execution arguments
     • wait can be blocked upon to wait for program/LPAR termination
     • debug subdirectory provides direct access to memory and, as appropriate, stack and registers; it could also be leveraged to give external debuggers (acid, gdb, etc.) direct access
     • stdin/stdout/stderr provide a console for LPARs
     File layout at a leaf (from the figure):
       clone
       <pid>/
         ctl  status  stderr  stdin  stdout  ns  nsgroup  debug/  env  args  wait
       n/
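The leaf-level files listed on this slide suggest a simple lifecycle: read clone to get a new instance, write commands to ctl, and block on wait. The following is a minimal in-memory sketch of that lifecycle, not the XCPU2 or UEM code. The classes `Leaf` and `Session`, the initial `running` status, and the numbering of instances from 0 are assumptions; `kill` is the only ctl command modelled because it is the only one whose effect the slides describe.

```python
import threading
from typing import Dict, Optional

# File names taken from the slide's leaf directory listing.
SESSION_FILES = ("ctl", "status", "stderr", "stdin", "stdout",
                 "ns", "nsgroup", "env", "args", "wait")


class Session:
    """Hypothetical model of one <pid>/ directory."""

    def __init__(self, sid: str) -> None:
        self.sid = sid
        self.files: Dict[str, str] = {name: "" for name in SESSION_FILES}
        self.files["status"] = "running"  # assumed initial state
        self._done = threading.Event()

    def write_ctl(self, command: str) -> None:
        """Record a command written to ctl; 'kill' terminates the session."""
        self.files["ctl"] += command + "\n"
        if command.strip() == "kill":
            self.files["status"] = "terminated"
            self._done.set()

    def wait(self, timeout: Optional[float] = None) -> bool:
        """Block until termination, like reading the wait file.

        Returns True once the session has terminated, False on timeout.
        """
        return self._done.wait(timeout)


class Leaf:
    """Hypothetical leaf directory exposing clone and numbered sessions."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}
        self._next_id = 0

    def clone(self) -> str:
        """Allocate a new session, mirroring a read of the clone file."""
        sid = str(self._next_id)
        self._next_id += 1
        self.sessions[sid] = Session(sid)
        return sid


leaf = Leaf()
first = leaf.clone()
session = leaf.sessions[first]
```

A caller would typically fill in `args` and `env`, start the program through `ctl`, and then block in `wait()`; those steps are omitted here because the slides do not name the corresponding commands.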
  8. Aggregation
     • Large Scale Distributed Systems
       – Scale is the principal research challenge
     • Infrastructure topology is best organized along natural aggregation points
       – On BG/P there is an I/O node for every 64 compute nodes
       – Traditional clusters can be organized by chassis or rack at similar density
       – Modern virtualized environments typically have a Dom0 or host as an aggregation point
       – Multi-level dynamic aggregation is also appealing for execution & other services [MTAGS08]
     • Aggregated Synthetic File System Interfaces
       – Information and control files at all levels of the hierarchy
       – Information-based synthetic files provide aggregated output
         • can be XML/s-rec style aggregation, or could be application/task specific
       – Commands sent to control files are multiplexed to lower-level control files
         • can leverage broadcast/multicast or other interconnect optimizations
       – Allows altering granularity based on workload requirements
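The two aggregation behaviours on this slide, commands multiplexed downward and information aggregated upward, can be sketched as a recursive tree walk. The `Node` class, the `up`/`down` states, and the plain line-per-leaf output format are assumptions made for illustration; the slide mentions XML or s-rec style aggregation, which this sketch does not attempt.

```python
from typing import List, Optional


class Node:
    """Hypothetical level in the hierarchy (rack, chassis, or leaf node)."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.children: List["Node"] = list(children or [])
        self.received: List[str] = []  # commands delivered to this leaf
        self.state = "up"

    def write_ctl(self, command: str) -> int:
        """Multiplex a command to every leaf below; return leaves reached."""
        if not self.children:
            self.received.append(command)
            if command == "kill":
                self.state = "down"
            return 1
        return sum(child.write_ctl(command) for child in self.children)

    def read_status(self) -> List[str]:
        """Aggregate one status line per leaf, in tree order."""
        if not self.children:
            return [f"{self.name} {self.state}"]
        lines: List[str] = []
        for child in self.children:
            lines.extend(child.read_status())
        return lines


rack = Node("rack01", [
    Node("chassis01", [Node("node01"), Node("node02")]),
    Node("chassis02", [Node("node03")]),
])
```

Writing to `rack.children[0]` affects only the two nodes in chassis01, which is the granularity control the last bullet describes: the level at which a command is written decides how many instances it reaches.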
  9. Abstracted Communications & Network Splicing
     "This is the Unix philosophy. Write programs that do one thing and do it well. Write programs to work together." (Doug McIlroy)
     • UNIX Pipelines
       – cat file | sort -n -r | uniq | more
     • PUSH Multipipes [PODC09]
       – ls |< cat | sort -n -r |#| uniq >| sort -n -r | more
       – New shell operators
         • |< : Fan Out Pipe (One to Many)
         • |#| : Hash Pipe (Many to Many)
         • >| : Fan In Pipe (Many to One)
     • Requires the ability to splice standard I/O between remote nodes
       – need control abstractions within the execution model to facilitate this
     • Example:

         # proc102 | proc56 | proc256
         # usage: splice <proc_path> <src_fd> <local_fd>
         % echo splice /proc/net/remote.com/102 1 0 > /proc/net/mybox.com/56/ctl
         % echo splice /proc/net/mybox.com/56 1 0 > /proc/net/farther.com/256/ctl
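The splice example on this slide follows a regular pattern: each downstream process is told to connect the upstream process's stdout (fd 1) to its own stdin (fd 0). A minimal sketch that generates those ctl writes for a linear pipeline is shown below. The helper name `splice_commands` is hypothetical; only the `splice <proc_path> <src_fd> <local_fd>` syntax and the paths come from the slide.

```python
from typing import List, Tuple


def splice_commands(stages: List[str]) -> List[Tuple[str, str]]:
    """Return (ctl_path, command) pairs wiring a linear pipeline.

    stages holds the process paths in pipeline order. For each adjacent
    pair, the downstream ctl file receives a splice of the upstream
    process's fd 1 onto its own fd 0.
    """
    commands = []
    for upstream, downstream in zip(stages, stages[1:]):
        commands.append((f"{downstream}/ctl", f"splice {upstream} 1 0"))
    return commands


pipeline = [
    "/proc/net/remote.com/102",
    "/proc/net/mybox.com/56",
    "/proc/net/farther.com/256",
]

# Print the equivalent shell commands.
for ctl_path, command in splice_commands(pipeline):
    print(f"echo {command} > {ctl_path}")
```

Run against the three paths above, this prints the same two `echo ... > .../ctl` lines as the slide's example. Fan-out, hash, and fan-in pipes would need more than one splice per stage and are not covered by this sketch.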
  10. Policy Modules
     • Pluggable
       – Can be user-defined, application-defined, or administrator-defined
     • Protective
       – keep rogue threads from allocating an obnoxious number of resources
     • Task-driven resource allocation
       – eliminate the double clutch (allocate node, execute task)
       – probably should avoid task migration (historically hasn't worked out)
         • may be easier to migrate LPARs than to migrate tasks
       – deriving the resource requirements of a particular task may be difficult
     • Resource Management
       – Consolidate LPARs from underutilized hardware
       – Colocate LPARs with heavy communication
     • Resiliency
       – spread out redundant LPARs to different availability domains
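The "pluggable" and "protective" bullets can be sketched as a chain of policy callables consulted before a request is granted. Everything named here (`Request`, `PolicyEngine`, `quota_policy`, the per-user LPAR limit) is a hypothetical illustration; the slides do not specify the policy interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Request:
    """Hypothetical provisioning request: a user asking for some LPARs."""
    user: str
    lpars: int


# A policy sees the request and current per-user usage and returns True to admit.
Policy = Callable[[Request, Dict[str, int]], bool]


def quota_policy(limit: int) -> Policy:
    """Protective policy: cap the total LPARs any one user may hold."""
    def check(request: Request, usage: Dict[str, int]) -> bool:
        return usage.get(request.user, 0) + request.lpars <= limit
    return check


class PolicyEngine:
    """Consults every plugged-in policy; all must agree to admit a request."""

    def __init__(self, policies: List[Policy]) -> None:
        self.policies = list(policies)
        self.usage: Dict[str, int] = {}

    def admit(self, request: Request) -> bool:
        if not all(policy(request, self.usage) for policy in self.policies):
            return False
        # Record the allocation only after every policy has approved it.
        self.usage[request.user] = self.usage.get(request.user, 0) + request.lpars
        return True


engine = PolicyEngine([quota_policy(limit=4)])
```

Because a policy is just a callable, user-defined, application-defined, and administrator-defined policies could be appended to the same list; placement policies such as consolidation or spreading across availability domains would need a richer return value than a boolean.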
  11. Status & Future Work
     • Still undergoing rapid development/re-development
       – Support for BlueGene/P via both Kittyhawk Linux & HARE Plan 9 environments
       – Preliminary support for EC2-compliant cloud APIs
     • Will be released as open source once some degree of stability is achieved
       – http://www.research.ibm.com/hare
     • Future Work
       – Larger scale application deployments
       – Integration with the Eucalyptus environment
       – Integration with a virtual machine as an additional level of the execution hierarchy
       – Build a runtime environment on top of UEM
         • Potentially combine UEM facilities with OpenCL or another language
       – Integration of policy mechanisms/agents for load balancing, power-aware, and migration-based optimization
       – Integration of abstracted support nodes (file servers, database servers, etc.)
       – Need policy models for dynamically adapting in the face of failure and resource changes
       – Heterogeneous architectures: GPU & Cell hybrid systems [PPAC09]
  12. Acknowledgements
     This work has been supported in part by the Department of Energy Office of Science Operating and Runtime Systems for Extreme Scale Scientific Computation project under contract #DE-FG02-08ER25851.
     Thanks! Questions?
  13. BACKUP SLIDES
  14. dot-dot-dot namespace shortcuts
     (Figure: the /proc/logical/ and /proc/phys/ trees side by side, with the current position marked in each.)

         % pwd
         /proc/logical/ericvh/map-reduce/node02/worker00/
         % cd .../phys
         % pwd
         /proc/phys/rack01/chassis02/blade01/lpar04/0/
  15. Central Splicing vs. Distributed Splicing
     (Figure: two panels, Central Splicing and Distributed Splicing, each showing hosts 1 to 5. Annotation: "Issues commands to UEMs of hosts 1–4 to initiate splices.")
  16. Time Evolving File System
     (Figure: a file system that changes over time, drawn in two layers.)
     • Interface (filesystem): /appfs/ with new and 0/ (ctl, data); later /gpufs/ appears with new and n/ (ctl, data)
     • Implementation (fileserver processes): each file system is backed by a fileserver process running main()
     • Annotations: a function call or new process streams data to the GPU accelerator for processing; a call to the UEM system splices gpufs/n/data with appfs/0/data