Scalable Elastic System Architecture (SESA)
Dan Schatzberg, Boston University
Jonathan Appavoo, Boston University
Orran Krieger, VMware
Eric Van Hensbergen, IBM Research Austin
The goal


Perform more computation with fewer resources
Fixed Resources

• Hardware as a fixed resource
• Focus on reducing computation’s need for
  hardware resources
• Multiplex hardware resources for different
  computations
Elastic Resources

• Cloud Computing
 • Pay as you go hardware
• Focus on providing hardware to the
  computation that requires it
Time to scale hardware

[Figure: timeline of how long it takes to scale hardware]
Days:          Fixed Hardware
Minutes:       Cloud Computing (Elastic Applications)
Milliseconds:  ?
Interactive HPC

• Medical imaging application
 • interactive
 • 1 megapixel image
 • quadratic memory consumption - ~14TB
Interactive HPC

• Fixed Hardware
 • Purchase a cluster
• Cloud Computing
 • Allocate a cluster
 • Maintain interactivity
 • 650+ EC2 instances: $8000 per 8-hour day
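As a rough sanity check on the cloud figure, the quoted $8000 for an 8-hour day across 650 instances comes to about $1.54 per instance-hour. The sketch below only performs that division; treating the instance count as exactly 650 (the slide says "650+") is an assumption, so the real per-instance rate would be at most this value.

```python
# Back-of-the-envelope cost check using the numbers quoted on the slide.
INSTANCES = 650          # slide says "650+"; exactly 650 is an assumption
DAILY_COST_USD = 8000    # quoted cost for one 8-hour day
HOURS_PER_DAY = 8


def cost_per_instance_hour(total_usd, instances, hours):
    """Average price paid per instance per hour."""
    return total_usd / (instances * hours)


rate = cost_per_instance_hour(DAILY_COST_USD, INSTANCES, HOURS_PER_DAY)
print(f"~${rate:.2f} per instance-hour")
```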
Can we do better?
Where we’re starting


Treat elasticity as a first-class system characteristic
OUTLINE
1. THE PROBLEM
2. OBSERVATIONS
 1. Top-Down Demand
 2. Bottom-Up Support
 3. Modularity
3. OUR TAKE ON A SOLUTION
4. PROTOTYPE & CHALLENGES
Top-Down Demand
[Figure, built up over several slides: a system interface with software layered above hardware.]
Events as Load
• Treat a service request as an event that is
  dispatched to resources
• As events occur, load increases
• As events are handled, load decreases
• Each layer being event-driven forces
  demand to flow top-down
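The bullets above can be sketched as a toy model in which each layer's load is simply its queue of unhandled events. This is an illustration only: the class and method names (`EventLayer`, `raise_event`, `handle_one`) are hypothetical and do not come from SESA.

```python
from collections import deque


class EventLayer:
    """Toy event-driven layer: pending events are the layer's load."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower          # next layer down, if any
        self.pending = deque()      # events accepted but not yet handled

    @property
    def load(self):
        return len(self.pending)

    def raise_event(self, event):
        """An event occurs: load increases."""
        self.pending.append(event)

    def handle_one(self):
        """Handle one event: load decreases here and demand moves down."""
        if not self.pending:
            return None
        event = self.pending.popleft()
        if self.lower is not None:
            self.lower.raise_event(event)
        return event


hardware = EventLayer("hardware")
software = EventLayer("software", lower=hardware)

for request in ("req-1", "req-2", "req-3"):
    software.raise_event(request)
print(software.load, hardware.load)   # 3 0

software.handle_one()
print(software.load, hardware.load)   # 2 1
```

Because a lower layer only sees work when the layer above handles an event, demand can only arrive from the top.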
Bottom-Up Support
[Figure, built up over several slides: a system interface above hardware, annotated "Allocate/Deallocate Resources".]
Elastic Interface
• Support elasticity by interfacing via
  allocation and deallocation of physical or
  logical resources
• Each layer is constructed by being explicit
  with respect to resource consumption
• Be explicit with respect to time to meet a
  request
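One way to picture such an interface is a layer that exposes only `allocate` and `deallocate`, tracks its consumption explicitly, and states how long a grant will take. The sketch below is a minimal illustration under those assumptions; `ElasticLayer`, `Grant`, and the latency numbers are hypothetical and not part of SESA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    units: int            # how many resource units were granted
    ready_in_ms: float    # stated time until the units are usable


class ElasticLayer:
    """Hypothetical layer exposing only allocate/deallocate."""

    def __init__(self, capacity, latency_ms, lower=None):
        self.capacity = capacity      # units this layer currently holds
        self.in_use = 0               # explicit resource consumption
        self.borrowed = 0             # units obtained from the layer below
        self.latency_ms = latency_ms  # time this layer needs per request
        self.lower = lower

    def allocate(self, units):
        free = self.capacity - self.in_use
        delay = self.latency_ms
        if units > free:
            if self.lower is None:
                raise RuntimeError("no layer below to grow from")
            grant = self.lower.allocate(units - free)
            self.capacity += grant.units
            self.borrowed += grant.units
            delay += grant.ready_in_ms   # time is explicit across layers
        self.in_use += units
        return Grant(units, delay)

    def deallocate(self, units):
        if units > self.in_use:
            raise ValueError("releasing more than is in use")
        self.in_use -= units
        # Return idle borrowed units so the layer below can reuse them.
        give_back = min(self.capacity - self.in_use, self.borrowed)
        if give_back > 0:
            self.lower.deallocate(give_back)
            self.borrowed -= give_back
            self.capacity -= give_back


nodes = ElasticLayer(capacity=8, latency_ms=50.0)             # bottom layer
runtime = ElasticLayer(capacity=2, latency_ms=1.0, lower=nodes)

grant = runtime.allocate(5)
print(grant)            # Grant(units=5, ready_in_ms=51.0)
runtime.deallocate(5)
print(nodes.in_use)     # 0
```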
Modularity
[Figure, built up over several slides: a system interface above hardware.]
Object model
• Objects can take advantage of
 • the semantics of their request patterns
 • the lifetime of an instance
 • their occupancy w.r.t. memory, processing
    and communication
• We can optimize for elasticity by taking
  advantage of modularity in a system
OUTLINE
1. THE PROBLEM
2. OBSERVATIONS
3. OUR TAKE ON A SOLUTION : SESA
 1. EBBs: Elastic Building Blocks
 2. SEE: Scalable Elastic Executive - A LibOS
 3. EPIC: Events as Interrupts
4. PROTOTYPE & CHALLENGES
Architecture Overview
[Figure, built up over several slides: a layered stack with an example at each layer. Hardware (FAWN, SSD); Partitioning (Kittyhawk); System Software (HAL); Applications (?).]
SESA
[Figure: SESA running an SE APP/SERVICE over an elastic partition of nodes]
System Software Layers:
  Component Layer:             EBB Namespace
  LibOS Layer:                 SEExecutive (SEE, SEE, SEE)
  Hardware Abstraction Layer:  SEMachine (SEHAL, SEHAL, SEHAL)
  Partitioning Layer:          VM/Node, VM/Node, VM/Node
EBBs (EBB Namespace)
A new component model for expressing and encapsulating fine-grain elasticity.
The next generation of Clustered Objects.
Clustered Objects (CO)
    dref(ctr)->inc();
[Figure: a clustered counter object invoked through dref(ctr)->inc(), with inc(), dec(), and val() operations; per-processor representatives (c) of the counter (C) are spread across processors and memory.]
What did we learn?


• Event-driven architecture for lazy and
  dynamic instantiation of resources
• Mechanism to create scalable software
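The lazy, per-processor instantiation described here can be sketched with a counter loosely modeled on the `dref(ctr)->inc()` example. This is an illustrative Python analogue only: the class names and the explicit `cpu` argument are assumptions, not the Clustered Objects API, and real representatives would be reached through a per-processor translation table rather than a dictionary.

```python
class CounterRep:
    """Per-processor representative holding a local share of the count."""

    def __init__(self):
        self.local = 0


class ClusteredCounter:
    def __init__(self):
        self.reps = {}          # processor id -> CounterRep, filled lazily

    def _rep(self, cpu):
        # First access from a processor is the "miss" that creates its rep.
        rep = self.reps.get(cpu)
        if rep is None:
            rep = self.reps[cpu] = CounterRep()
        return rep

    def inc(self, cpu):
        self._rep(cpu).local += 1      # touches only local state

    def dec(self, cpu):
        self._rep(cpu).local -= 1

    def val(self):
        # The rarer global read gathers every representative's share.
        return sum(rep.local for rep in self.reps.values())


ctr = ClusteredCounter()
for cpu in (0, 0, 3, 7):
    ctr.inc(cpu)
ctr.dec(3)
print(ctr.val(), sorted(ctr.reps))    # 3 [0, 3, 7]
```

Only processors that actually touch the counter pay for a representative, which is the on-demand behaviour the slide highlights.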
Elastic Building Blocks
• Programming model for elastic and scalable components
• Span multiple nodes
• Built-in on-demand nature: encapsulation of policies
  for both allocation and deallocation of resources
SEE (SEExecutive)
A distributed library OS model designed to enable elastic software within the context of legacy environments.
Next generation of Libra.
Libra
[Figure 1. Proposed system architecture: a controller partition (general-purpose OS, gateway, App) alongside application partitions (Apps on a JVM or DB, each over Libra), all running on a hypervisor.]
Libra
[Figure, built up over several slides: x86 Linux front ends connected over 9p to a pool of Libra partitions running on PowerPC blades (Libra workers). Commands typed at the front-end shells:]
    $ java -cp my.jar
    $ for ((i=0;i<44;i++))
      do
         java -cp my.jar &
      done
What did we learn?

• Specialized environment for each
  application
• Lightweight system layer implementing
  services for performance
• General-purpose OS for non-performance-critical
  services
SEE : A LibOS for SESA
• Distributed LibOS that can elastically span nodes
• Instances cooperate to support the allocation
  and deallocation of EBBs
• Enables compatibility with Front End nodes
  running via unified 9p namespace
[Figure: per-node EBB manifestations on top of the Scalable Elastic Executive (SEE), which comprises the EBB infrastructure, the FS name space protocol (9p), a locality-aware memory allocator, an event dispatcher, and inter-node communication protocols and primitives, all above the SEHAL.]
SEMachines and EPICs
[Figure: the SEMachine, made up of per-node SEHALs, forms the Hardware Abstraction Layer: EPIC.]
Programmable Interrupt Controller
[Figure: Source -> Interrupt -> bit vector (0 1 1 0, then 1 1 1 0) -> Execution. A later build relabels the same picture as Source -> Event -> bit vector -> Action.]
Elastic Programmable Interrupt Controller
[Figure: Source -> Event -> bit vector -> Action, first with a wider bit vector (1 0 1 1 1 0 0 0), then with one of open-ended length (1 0 1 1 ... 0 1 1 1).]
Elastic Programmable Interrupt Controller
• Programmed by the SEE
• Provides the minimum requirement of
  elastic applications: mapping load to
  resources
• Portable layer
• Take advantage of network features such as
  broadcast and multicast
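Mapping load to resources in the style of an interrupt controller can be sketched as a table from event sources to target bitmasks, where the width of the mask changes as nodes join or leave. The sketch below is a toy illustration; `ElasticPIC`, `program`, `resize`, and `dispatch` are hypothetical names, and a route with several bits set stands in for the multicast case.

```python
class ElasticPIC:
    """Toy interrupt-controller-style table: event source -> target bitmask."""

    def __init__(self, num_targets):
        self.num_targets = num_targets
        self.routes = {}                      # source -> int bitmask

    def program(self, source, targets):
        """Set which targets receive events from `source`."""
        mask = 0
        for t in targets:
            if not 0 <= t < self.num_targets:
                raise ValueError(f"target {t} out of range")
            mask |= 1 << t
        self.routes[source] = mask

    def resize(self, num_targets):
        """Grow or shrink the target set; routes to removed targets are cleared."""
        self.num_targets = num_targets
        keep = (1 << num_targets) - 1
        for source in self.routes:
            self.routes[source] &= keep

    def dispatch(self, source):
        """Return the targets an event from `source` should be delivered to."""
        mask = self.routes.get(source, 0)
        return [t for t in range(self.num_targets) if (mask >> t) & 1]


epic = ElasticPIC(num_targets=4)
epic.program("matrix-op", [0, 1, 2])
print(epic.dispatch("matrix-op"))     # [0, 1, 2]

epic.resize(8)                         # the partition grew to 8 nodes
epic.program("matrix-op", [0, 2, 3, 4])
print(epic.dispatch("matrix-op"))     # [0, 2, 3, 4]
```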
OUTLINE

1. THE PROBLEM
2. OBSERVATIONS
3. OUR TAKE ON A SOLUTION
4. PROTOTYPE & CHALLENGES
PROTOTYPE APP
[Figure: Sage* (OL) on top of SESA SAGE SERVICES (Elastic Matrix Cache, Elastic Matrix Ops), which run on SEE: EBB's + EHAL. The stack targets Traditional HW and, through Kittyhawk, Advanced HW.]
Challenges and Discussion
OUTLINE
1. THE PROBLEM
 1. Pay as you go computing
 2. Insufficient systems support for elasticity
2. OBSERVATIONS
3. OUR TAKE ON A SOLUTION
4. PROTOTYPE & CHALLENGES
Pay as you go hardware
[Figure, built up over several slides: a consumer running software and a provider, with a request passing between them.]
Elastic Website
[Figure, built up over several slides: a load balancer on the consumer side, above the provider.]
Other Elastic Applications

• Analytics
• Batch computation
• Stream processing
What’s the problem?


• Allocation/Boot-time
• Programmability
Medical Imaging Application
• Megapixel image
• Quadratic algorithm
• (1 million pixels * 4 bytes/pixel)^2 = 1.6 * 10^13 bytes, roughly 14.5 TiB
• On Amazon EC2 ~ $8000 per day
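The memory estimate follows directly from squaring the size of the image in bytes. The short check below reproduces it; the only inputs are the pixel count and bytes per pixel from the slide, and the result shows that the quoted "~14 TB" matches the binary-unit value (TiB) rather than the decimal one.

```python
PIXELS = 1_000_000        # "megapixel image"
BYTES_PER_PIXEL = 4

linear_bytes = PIXELS * BYTES_PER_PIXEL      # size of the image itself
quadratic_bytes = linear_bytes ** 2          # quadratic memory consumption

print(f"{quadratic_bytes:.1e} bytes")            # 1.6e+13 bytes
print(f"{quadratic_bytes / 10**12:.1f} TB")      # 16.0 TB
print(f"{quadratic_bytes / 2**40:.2f} TiB")      # 14.55 TiB
```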
SnowFlock
[Figure, built up over several slides: consumer and provider.]
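Distributing an Object
[Figure: a non-distributed object instance holds a region list lock (L), a region list (R0, R1, R2), and other data structures. The distributed version has a root and representatives Rep0, Rep1, and Rep2, each with its own lock (L) and its own region entries.]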
Elastic Programmable Interrupt Controller
[Figure: Source -> Event -> bit vector (1 1 1 0) -> Action.]

More Related Content

What's hot

VIOS in action with IBM i
VIOS in action with IBM i VIOS in action with IBM i
VIOS in action with IBM i COMMON Europe
 
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAs
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAsScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAs
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAsShinya Takamaeda-Y
 
Presentation power vm common 2012
Presentation   power vm common 2012Presentation   power vm common 2012
Presentation power vm common 2012solarisyougood
 
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)AllineaSoftware
 
Presentation power vm editions and power systems virtualization - basic
Presentation   power vm editions and power systems virtualization - basicPresentation   power vm editions and power systems virtualization - basic
Presentation power vm editions and power systems virtualization - basicsolarisyougood
 
Develop, Deploy, and Innovate with Intel® Cluster Ready
Develop, Deploy, and Innovate with Intel® Cluster ReadyDevelop, Deploy, and Innovate with Intel® Cluster Ready
Develop, Deploy, and Innovate with Intel® Cluster ReadyIntel IT Center
 
IBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationIBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationCOMMON Europe
 
PowerVM Live Partition Mobility in IBM PureFlex
PowerVM Live Partition Mobility in IBM PureFlexPowerVM Live Partition Mobility in IBM PureFlex
PowerVM Live Partition Mobility in IBM PureFlexLuca Comparini
 
Aix The Future of UNIX
Aix The Future of UNIX Aix The Future of UNIX
Aix The Future of UNIX xKinAnx
 
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation AMD
 
Marvell SR-IOV Improves Server Virtualization Performance
Marvell SR-IOV Improves Server Virtualization PerformanceMarvell SR-IOV Improves Server Virtualization Performance
Marvell SR-IOV Improves Server Virtualization PerformanceMarvell
 
Re usable continuous-time analog sva assertions - slides
Re usable continuous-time analog sva assertions - slidesRe usable continuous-time analog sva assertions - slides
Re usable continuous-time analog sva assertions - slidesRégis SANTONJA
 
Student guide power systems for aix - virtualization i implementing virtual...
Student guide   power systems for aix - virtualization i implementing virtual...Student guide   power systems for aix - virtualization i implementing virtual...
Student guide power systems for aix - virtualization i implementing virtual...solarisyougood
 
Panasas pNFS Status (September 2010)
Panasas pNFS Status (September 2010)Panasas pNFS Status (September 2010)
Panasas pNFS Status (September 2010)Panasas
 
Mixed signal verification challenges - slides
Mixed signal verification challenges - slidesMixed signal verification challenges - slides
Mixed signal verification challenges - slidesRégis SANTONJA
 

What's hot (20)

VIOS in action with IBM i
VIOS in action with IBM i VIOS in action with IBM i
VIOS in action with IBM i
 
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAs
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAsScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAs
ScalableCore System: A Scalable Many-core Simulator by Employing Over 100 FPGAs
 
Presentation power vm common 2012
Presentation   power vm common 2012Presentation   power vm common 2012
Presentation power vm common 2012
 
101 cd 1415-1445
101 cd 1415-1445101 cd 1415-1445
101 cd 1415-1445
 
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)
 
Technology (1)
Technology (1)Technology (1)
Technology (1)
 
Presentation power vm editions and power systems virtualization - basic
Presentation   power vm editions and power systems virtualization - basicPresentation   power vm editions and power systems virtualization - basic
Presentation power vm editions and power systems virtualization - basic
 
Develop, Deploy, and Innovate with Intel® Cluster Ready
Develop, Deploy, and Innovate with Intel® Cluster ReadyDevelop, Deploy, and Innovate with Intel® Cluster Ready
Develop, Deploy, and Innovate with Intel® Cluster Ready
 
Windows Server 2012 Hyper-V Networking Evolved
Windows Server 2012 Hyper-V Networking Evolved Windows Server 2012 Hyper-V Networking Evolved
Windows Server 2012 Hyper-V Networking Evolved
 
IBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationIBM i client partitions concepts and implementation
IBM i client partitions concepts and implementation
 
PowerVM Live Partition Mobility in IBM PureFlex
PowerVM Live Partition Mobility in IBM PureFlexPowerVM Live Partition Mobility in IBM PureFlex
PowerVM Live Partition Mobility in IBM PureFlex
 
Aix The Future of UNIX
Aix The Future of UNIX Aix The Future of UNIX
Aix The Future of UNIX
 
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
 
Marvell SR-IOV Improves Server Virtualization Performance
Marvell SR-IOV Improves Server Virtualization PerformanceMarvell SR-IOV Improves Server Virtualization Performance
Marvell SR-IOV Improves Server Virtualization Performance
 
Re usable continuous-time analog sva assertions - slides
Re usable continuous-time analog sva assertions - slidesRe usable continuous-time analog sva assertions - slides
Re usable continuous-time analog sva assertions - slides
 
Intel tools to optimize HPC systems
Intel tools to optimize HPC systemsIntel tools to optimize HPC systems
Intel tools to optimize HPC systems
 
Student guide power systems for aix - virtualization i implementing virtual...
Student guide   power systems for aix - virtualization i implementing virtual...Student guide   power systems for aix - virtualization i implementing virtual...
Student guide power systems for aix - virtualization i implementing virtual...
 
27ian2011 hp
27ian2011   hp27ian2011   hp
27ian2011 hp
 
Panasas pNFS Status (September 2010)
Panasas pNFS Status (September 2010)Panasas pNFS Status (September 2010)
Panasas pNFS Status (September 2010)
 
Mixed signal verification challenges - slides
Mixed signal verification challenges - slidesMixed signal verification challenges - slides
Mixed signal verification challenges - slides
 

Similar to Scalable Elastic Systems Architecture (SESA)

OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVMJohn Lee
 
9th docker meetup 2016.07.13
9th docker meetup 2016.07.139th docker meetup 2016.07.13
9th docker meetup 2016.07.13Amrita Prasad
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user groupShapeBlue
 
Instrumenting the real-time web
Instrumenting the real-time webInstrumenting the real-time web
Instrumenting the real-time webbcantrill
 
Learn OpenStack from trystack.cn ——Folsom in practice
Learn OpenStack from trystack.cn  ——Folsom in practiceLearn OpenStack from trystack.cn  ——Folsom in practice
Learn OpenStack from trystack.cn ——Folsom in practiceOpenCity Community
 
20110507 Implementing Continuous Deployment
20110507 Implementing Continuous Deployment20110507 Implementing Continuous Deployment
20110507 Implementing Continuous DeploymentXebiaLabs
 
Ram chinta hug-20120922-v1
Ram chinta hug-20120922-v1Ram chinta hug-20120922-v1
Ram chinta hug-20120922-v1Ram Chinta
 
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and KnativeBuild and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and KnativeOmar Al-Safi
 
Introduction to Container Management on AWS
Introduction to Container Management on AWSIntroduction to Container Management on AWS
Introduction to Container Management on AWSAmazon Web Services
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422journeyer
 
Windows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformWindows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformEduardo Castro
 
[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at Nuxeo[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at NuxeoNuxeo
 
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...Amazon Web Services
 
Paul Angus - CloudStack Container Service
Paul  Angus - CloudStack Container ServicePaul  Angus - CloudStack Container Service
Paul Angus - CloudStack Container ServiceShapeBlue
 
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV ClusterMethod of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Clusterbyonggon chun
 

Similar to Scalable Elastic Systems Architecture (SESA) (20)

OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
9th docker meetup 2016.07.13
9th docker meetup 2016.07.139th docker meetup 2016.07.13
9th docker meetup 2016.07.13
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user group
 
Instrumenting the real-time web
Instrumenting the real-time webInstrumenting the real-time web
Instrumenting the real-time web
 
Istio presentation jhug
Istio presentation jhugIstio presentation jhug
Istio presentation jhug
 
Kubernetes 1001
Kubernetes 1001Kubernetes 1001
Kubernetes 1001
 
Learn OpenStack from trystack.cn ——Folsom in practice
Learn OpenStack from trystack.cn  ——Folsom in practiceLearn OpenStack from trystack.cn  ——Folsom in practice
Learn OpenStack from trystack.cn ——Folsom in practice
 
20110507 Implementing Continuous Deployment
20110507 Implementing Continuous Deployment20110507 Implementing Continuous Deployment
20110507 Implementing Continuous Deployment
 
Ram chinta hug-20120922-v1
Ram chinta hug-20120922-v1Ram chinta hug-20120922-v1
Ram chinta hug-20120922-v1
 
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and KnativeBuild and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
 
Introduction to Container Management on AWS
Introduction to Container Management on AWSIntroduction to Container Management on AWS
Introduction to Container Management on AWS
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422
 
2012 11 Openstack China
2012 11 Openstack China2012 11 Openstack China
2012 11 Openstack China
 
Kubernetes @ meetic
Kubernetes @ meeticKubernetes @ meetic
Kubernetes @ meetic
 
Windows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformWindows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing Platform
 
[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at Nuxeo[NYC Meetup] Docker at Nuxeo
[NYC Meetup] Docker at Nuxeo
 
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...
AWS re:Invent 2016: From Monolithic to Microservices: Evolving Architecture P...
 
Paul Angus - CloudStack Container Service
Paul  Angus - CloudStack Container ServicePaul  Angus - CloudStack Container Service
Paul Angus - CloudStack Container Service
 
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV ClusterMethod of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster
Method of NUMA-Aware Resource Management for Kubernetes 5G NFV Cluster
 

More from Eric Van Hensbergen (20)

Scaling Arm from One to One Trillion
Scaling Arm from One to One TrillionScaling Arm from One to One Trillion
Scaling Arm from One to One Trillion
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
 
ISC14 Embedded HPC BoF Panel Presentation
ISC14 Embedded HPC BoF Panel PresentationISC14 Embedded HPC BoF Panel Presentation
ISC14 Embedded HPC BoF Panel Presentation
 
Brasil Ross 2011
Brasil Ross 2011Brasil Ross 2011
Brasil Ross 2011
 
Multipipes
MultipipesMultipipes
Multipipes
 
Multi-pipes
Multi-pipesMulti-pipes
Multi-pipes
 
VirtFS
VirtFSVirtFS
VirtFS
 
HARE 2010 Review
HARE 2010 ReviewHARE 2010 Review
HARE 2010 Review
 
PUSH-- a Dataflow Shell
PUSH-- a Dataflow ShellPUSH-- a Dataflow Shell
PUSH-- a Dataflow Shell
 
XCPU3: Workload Distribution and Aggregation
XCPU3: Workload Distribution and AggregationXCPU3: Workload Distribution and Aggregation
XCPU3: Workload Distribution and Aggregation
 
9P Code Walkthrough
9P Code Walkthrough9P Code Walkthrough
9P Code Walkthrough
 
9P Overview
9P Overview9P Overview
9P Overview
 
Push Podc09
Push Podc09Push Podc09
Push Podc09
 
Libra: a Library OS for a JVM
Libra: a Library OS for a JVMLibra: a Library OS for a JVM
Libra: a Library OS for a JVM
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS Interference
 
PROSE
PROSEPROSE
PROSE
 
Libra Library OS
Libra Library OSLibra Library OS
Libra Library OS
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task Computing
 
Holistic Aggregate Resource Environment
Holistic Aggregate Resource EnvironmentHolistic Aggregate Resource Environment
Holistic Aggregate Resource Environment
 
Paravirtualized File Systems
Paravirtualized File SystemsParavirtualized File Systems
Paravirtualized File Systems
 

Scalable Elastic Systems Architecture (SESA)

  • 1. Scalable Elastic System Architecture (SESA) Dan Schatzberg, Boston University Jonathan Appavoo, Boston University Orran Krieger,VMware Eric Van Hensbergen, IBM Research Austin
  • 2. The goal Perform more computation with fewer resources
  • 3. Fixed Resources • Hardware as a fixed resource • Focus on reducing computation’s need for hardware resources • Multiplex hardware resources for different computations
  • 4. Elastic Resources • Cloud Computing • Pay as you go hardware • Focus on providing hardware to the computation that requires it
  • 5. Time to scale hardware • Days: Fixed Hardware • Minutes: Cloud Computing
  • 6. Time to scale hardware • Days: Fixed Hardware • Minutes: Cloud Computing, Elastic Applications
  • 7. Time to scale hardware • Days: Fixed Hardware • Minutes: Cloud Computing, Elastic Applications • Milliseconds: ?
  • 8. Interactive HPC • Medical imaging application • interactive • 1 megapixel image • quadratic memory consumption - ~14TB
  • 9. Interactive HPC • Fixed Hardware • Purchase a cluster
  • 10. Interactive HPC • Cloud Computing • Allocate a cluster • Maintain interactivity • 650+ EC2 instances - $8000 per 8-hour day
  • 11. Can we do better?
  • 12. Where we’re starting Treat elasticity as a first-class system characteristic
  • 13. OUTLINE 1. THE PROBLEM 2. OBSERVATIONS 1. Top-Down Demand 2. Bottom-Up Support 3. Modularity 3. OUR TAKE ON A SOLUTION 4. PROTOTYPE & CHALLENGES
  • 14. Top-Down Demand System Interface Software Hardware
  • 20. Events as Load • Treat a service request as an event that is dispatched to resources • As events occur, load increases • As events are handled, load decreases • Each layer being event-driven forces demand to flow top-down
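The load accounting in slide 20 can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Layer` class that is not part of SESA: load rises when an event occurs, falls when the event is handled, and handling an event at one layer dispatches it to the layer below, so demand flows top-down.

```python
from collections import deque


class Layer:
    """Hypothetical event-driven layer; all names here are illustrative only."""

    def __init__(self, name, below=None):
        self.name = name
        self.below = below      # next layer down, or None for the bottom layer
        self.pending = deque()  # events that have occurred but are not yet handled

    @property
    def load(self):
        # Load is modeled as the number of outstanding events.
        return len(self.pending)

    def dispatch(self, event):
        # An event occurs: load increases.
        self.pending.append(event)

    def handle_one(self):
        # An event is handled: load decreases here and demand moves one layer down.
        event = self.pending.popleft()
        if self.below is not None:
            self.below.dispatch(event)
        return event


hardware = Layer("hardware")
software = Layer("software", below=hardware)

software.dispatch("request-1")
software.dispatch("request-2")
loads_before = (software.load, hardware.load)

software.handle_one()
loads_after = (software.load, hardware.load)
```

Modeling load as a queue length is an assumption made for brevity; a real layer could weight events by cost.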
  • 22. Bottom-Up Support System Interface Hardware Allocate/Deallocate Resources
  • 24. Elastic Interface • Support elasticity by interfacing via allocation and deallocation of physical or logical resources • Each layer is constructed by being explicit with respect to resource consumption • Be explicit with respect to time to meet a request
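One way to read slide 24 is as an interface with exactly two operations, allocate and deallocate, where every request states how much it consumes and how quickly it must be met. The sketch below assumes a hypothetical `ElasticPool` with a fixed per-unit allocation latency; neither the class nor its parameters come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """What a layer hands back: how much was allocated and how long it took."""
    units: int
    latency_ms: float


class ElasticPool:
    """Hypothetical pool of physical or logical resources (illustrative only)."""

    def __init__(self, capacity, latency_ms_per_unit):
        self.capacity = capacity
        self.in_use = 0
        self.latency_ms_per_unit = latency_ms_per_unit

    def allocate(self, units, deadline_ms):
        # Explicit about time: refuse requests that cannot meet the deadline.
        latency = units * self.latency_ms_per_unit
        if latency > deadline_ms:
            raise TimeoutError(f"need {latency} ms, deadline is {deadline_ms} ms")
        # Explicit about consumption: never hand out more than exists.
        if self.in_use + units > self.capacity:
            raise RuntimeError("not enough free resources")
        self.in_use += units
        return Grant(units=units, latency_ms=latency)

    def deallocate(self, grant):
        # Releasing resources is a first-class operation of the interface.
        self.in_use -= grant.units


pool = ElasticPool(capacity=8, latency_ms_per_unit=0.5)
grant = pool.allocate(units=4, deadline_ms=5.0)
in_use_after_alloc = pool.in_use
pool.deallocate(grant)
in_use_after_free = pool.in_use
```

Stacking such pools, with each layer calling `allocate` on the one below, gives the bottom-up support the slide describes.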
  • 29. Object model • Objects can take advantage of: • the semantics of their request patterns • the lifetime of an instance • the occupancy w.r.t. memory, processing and communication • We can optimize for elasticity by taking advantage of modularity in a system
  • 30. OUTLINE 1. THE PROBLEM 2. OBSERVATIONS 3. OUR TAKE ON A SOLUTION : SESA 1. EBB’s : Elastic Building Blocks 2. SEE: Scalable Elastic Executive - A LibOS 3. EPIC: Events as Interrupts 4. PROTOTYPE & CHALLENGES
  • 33. Architecture Overview Partitioning FAWN SSD
  • 34. Architecture Overview Kittyhawk FAWN SSD
  • 35. Architecture Overview System Software Kittyhawk FAWN SSD
  • 36. Architecture Overview HAL Kittyhawk FAWN SSD
  • 37. Architecture Overview Applications HAL Kittyhawk FAWN SSD
  • 38. Architecture Overview ? HAL Kittyhawk FAWN SSD
  • 39. Architecture Overview ? HAL Kittyhawk FAWN SSD
  • 40. SESA • SE APP/SERVICE • System Software Layers • Component Layer: EBB Namespace • LibOS Layer: SEExecutive (SEE, SEE, SEE) • Hardware Abstraction Layer: SEMachine (SEHAL, SEHAL, SEHAL) • Partitioning Layer: Elastic Partition of Nodes (VM/Node, VM/Node, VM/Node)
  • 41. EBB’s (EBB NameSpace) • A new Component Model for expressing and encapsulating fine grain elasticity. • The Next Generation of Clustered Objects.
  • 42. Clustered Objects (CO) • dref(ctr)->inc(); • [Figure: a counter Clustered Object offering inc(), dec() and val(), with representatives (c) spread across Processors and Memory]
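The counter on slide 42 can be approximated in a few lines. This is a sketch under stated assumptions, not the Clustered Objects implementation: the class names are hypothetical, the processor id is passed explicitly instead of being resolved by `dref`, and there is no real concurrency. It shows the two ideas the slides stress: each processor updates only its own representative, and representatives are created lazily on first access.

```python
class CounterRep:
    """Per-processor representative holding a local share of the count."""

    def __init__(self):
        self.local = 0


class ClusteredCounter:
    """Hypothetical clustered counter; processor ids are passed in explicitly."""

    def __init__(self):
        self.reps = {}  # processor id -> representative, filled in lazily

    def _rep(self, cpu):
        # First access from a processor creates its representative on demand.
        if cpu not in self.reps:
            self.reps[cpu] = CounterRep()
        return self.reps[cpu]

    def inc(self, cpu):
        self._rep(cpu).local += 1  # touches only this processor's state

    def dec(self, cpu):
        self._rep(cpu).local -= 1

    def val(self):
        # Reading the value gathers every representative's local share.
        return sum(rep.local for rep in self.reps.values())


ctr = ClusteredCounter()
for cpu in (0, 0, 3):
    ctr.inc(cpu)
ctr.dec(3)
total = ctr.val()
reps_created = sorted(ctr.reps)
```

Only processors 0 and 3 ever touched the counter, so only two representatives exist; `inc` and `dec` stay local while `val` pays the cost of gathering.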
  • 43. What did we learn? • Event-driven architecture for lazy and dynamic instantiation of resources • Mechanism to create scalable software
  • 44. Elastic Building Blocks • Programming Model for Elastic and Scalable Components • Span multiple nodes • Built in On Demand nature -- encapsulation of policies for both allocation and deallocation of resources
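To make "encapsulation of policies for both allocation and deallocation" concrete, here is a small sketch. Everything in it is an assumption for illustration: the `ElasticBlock` class, the idle-tick release policy, and the string node ids are hypothetical and are not the EBB API. The component allocates a per-node representative on first use and releases it after it has been idle for longer than a limit, folding its state back into a shared total.

```python
class ElasticBlock:
    """Hypothetical EBB-like component; policy and names are assumptions."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit  # idle ticks tolerated before release
        self.reps = {}                # node id -> {"count": int, "idle": int}
        self.folded = 0               # state kept from released representatives

    def add(self, node, amount):
        # Allocation policy: create the per-node representative on first use.
        rep = self.reps.setdefault(node, {"count": 0, "idle": 0})
        rep["count"] += amount
        rep["idle"] = 0               # the node was just used, so it is not idle

    def tick(self):
        # Deallocation policy: release representatives that stayed idle too long.
        for node in list(self.reps):
            rep = self.reps[node]
            rep["idle"] += 1
            if rep["idle"] > self.idle_limit:
                self.folded += rep["count"]
                del self.reps[node]

    def total(self):
        return self.folded + sum(rep["count"] for rep in self.reps.values())


block = ElasticBlock(idle_limit=1)
block.add("node-a", 5)
block.add("node-b", 2)
block.tick()
block.add("node-a", 1)  # node-a stays active, node-b does not
block.tick()
live_nodes = sorted(block.reps)
grand_total = block.total()
```

Callers see only `add` and `total`; when resources appear and disappear is decided inside the component.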
  • 45. SEE (SEExecutive) • A Distributed Library OS Model designed to enable Elastic Software within the context of legacy environments. • Next Generation of Libra
  • 46. Libra Architecture • [Figure 1. Proposed system architecture: a Controller partition and several Application partitions above a Hypervisor; labels include Gateway, App, DB, JVM, General Purpose OS and Libra]
  • 47. Libra X86 Linux Front Ends Pool of Libra Partitions $ 9p $ PowerPC Blades: Libra Workers
  • 48. Libra X86 Linux Front Ends Pool of Libra Partitions $ java -cp my.jar 9p $ PowerPC Blades: Libra Workers
  • 49. Libra X86 Linux Front Ends Pool of Libra Partitions $ java -cp my.jar 9p $ PowerPC Blades: Libra Workers
  • 50. Libra X86 Linux Front Ends Pool of Libra Partitions $ java -cp my.jar 9p $ for ((i=0;i<44;i++)) do java -cp my.jar & done PowerPC Blades: Libra Workers
  • 51. Libra X86 Linux Front Ends Pool of Libra Partitions $ java -cp my.jar 9p $ for ((i=0;i<44;i++)) do java -cp my.jar & done PowerPC Blades: Libra Workers
  • 52. Libra X86 Linux Front Ends Pool of Libra Partitions $ java -cp my.jar 9p $ for ((i=0;i<44;i++)) do java -cp my.jar & done PowerPC Blades: Libra Workers
  • 53. What did we learn? • Specialized environment for each application • Lightweight system layer implementing services for performance • General purpose OS for non-performance critical services
  • 54. SEE : A LibOS for SESA • Distributed LibOS that can elastically span nodes • Instances cooperate to support the allocation and deallocation of EBB’s • Enables compatibility with Front End nodes running via unified 9p namespace • [Figure: Scalable Elastic Executive (SEE) on SEHAL, comprising per-node EBB manifestations, EBB Infrastructure, FS Name Space Protocol (9p), locality aware memory allocator, inter-node communication protocols, event dispatcher primitives]
  • 55. SEMachines and EPICs SEMachine Hardware Abstraction Layer : SEHAL SEHAL SEHAL EPIC
  • 56. Programmable Interrupt Controller Source Interrupt 0 1 1 0 Execution
  • 57. Programmable Interrupt Controller Source Interrupt 1 1 1 0 Execution
  • 58. Programmable Interrupt Controller Source Event 1 1 1 0 Action
  • 59. Elastic Programmable Interrupt Controller Source Event 1 0 1 1 1 0 0 0 Action
  • 60. Elastic Programmable Interrupt Controller Source Event ... 1 0 1 1 0 1 1 1 ... Action ...
  • 61. Elastic Programmable Interrupt Controller • Programmed by the SEE • Provides the minimum requirement of elastic applications - mapping load to resources • Portable layer • Take advantage of network features such as broadcast and multicast
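The bit vectors drawn on the EPIC slides suggest a routing table from event sources to sets of nodes. The sketch below is one possible reading, not the SESA design: the `Epic` class, the bitmask encoding, and the source names are all hypothetical. The executive programs a mask per source; raising an event runs the bound action on every node whose bit is set, so a mask with several bits behaves like multicast and a full mask like broadcast.

```python
class Epic:
    """Hypothetical model of an elastic interrupt controller (names are assumptions)."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.routes = {}  # event source -> (node bitmask, action)

    def program(self, source, mask, action):
        # Done by the executive: bind a source to an action on a set of nodes.
        self.routes[source] = (mask, action)

    def raise_event(self, source, payload):
        # Map load to resources: run the action on every node whose bit is set.
        mask, action = self.routes[source]
        return [action(node, payload)
                for node in range(self.num_nodes)
                if mask & (1 << node)]


def handler(node, payload):
    return f"node{node}:{payload}"


epic = Epic(num_nodes=4)
epic.program("net-rx", 0b0001, handler)     # unicast to node 0
epic.program("rebalance", 0b1010, handler)  # multicast to nodes 1 and 3
epic.program("shutdown", 0b1111, handler)   # broadcast to every node

unicast = epic.raise_event("net-rx", "pkt")
multicast = epic.raise_event("rebalance", "go")
broadcast = epic.raise_event("shutdown", "now")
```

Growing or shrinking the set of nodes that serve a source then amounts to reprogramming one mask.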
  • 62. OUTLINE 1. THE PROBLEM 2. OBSERVATIONS 3. OUR TAKE ON A SOLUTION 4. PROTOTYPE & CHALLENGES
  • 63. PROTOTYPE APP SESA SAGE Traditional HW Sage* SERVICES Elastic SEE: EBB’s + EHAL OL Matrix Cache Advanced HW Elastic Matrix Kittyhawk Ops
  • 65. OUTLINE 1. THE PROBLEM 1. Pay as you go computing 2. Insufficient systems support for elasticity 2. OBSERVATIONS 3. OUR TAKE ON A SOLUTION 4. PROTOTYPE & CHALLENGES
  • 66. Pay as you go hardware Software Consumer Provider
  • 67. Pay as you go hardware Software Consumer Request Provider
  • 68. Pay as you go hardware Software Consumer Provider
  • 69. Elastic Website Load Balancer Consumer Provider
  • 70. Elastic Website Load Balancer Consumer Provider
  • 71. Elastic Website Load Balancer Consumer Provider
  • 72. Other Elastic Applications • Analytics • Batch computation • Stream processing
  • 73. What’s the problem? • Allocation/Boot-time • Programmability
  • 74. Medical Imaging Application • Megapixel image • Quadratic algorithm • (1 mil pixels * 4 bytes/pixel)^2 ~ 14 TB • On Amazon EC2 ~ $8000 per day
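The memory estimate on slide 74 can be checked directly. The snippet assumes "1 mil pixels" means 10^6 pixels; under that assumption the product is 1.6 × 10^13 bytes, which is 16 decimal terabytes or about 14.5 binary terabytes (TiB), so the slide's "~14 TB" matches the binary unit.

```python
pixels = 1_000_000        # "1 mil pixels" from the slide, read as 10**6
bytes_per_pixel = 4
total_bytes = (pixels * bytes_per_pixel) ** 2  # quadratic in the image size

tb = total_bytes / 10**12   # decimal terabytes
tib = total_bytes / 2**40   # binary terabytes (TiB)
print(f"{total_bytes:.2e} bytes = {tb:.1f} TB = {tib:.2f} TiB")
```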
  • 78. Distributing an Object Non-Distributed Object Instance Region List Lock L Region List R0 Other R1 Data Structures R2
  • 79. L R0 Root Rep0 L R0 L R1 R1 Rep1 L R2 R0 R2 Rep2
  • 80. Elastic Programmable Interrupt Controller Source Event 1 1 1 0 Action