Apache CloudStack
   Evolution Proposal
Alex Huang
Software Architect, Citrix Systems
A little bit about me

• Cloud.Com Founding Engineer
• Software architect for CloudPlatform
• Responsible for overall
  architecture, performance, and scalability
• Committer and PPMC member
• BS from UC Berkeley and MS from Stanford
Design Goals

• Make it easier for developers to get started
• Allow developers with different skill sets to work on
  different parts of CloudStack
• Give service providers the choice to deploy only the
  parts of CloudStack that they want to use
• Allow CloudStack components to be written in
  languages other than Java
• Increase the availability and maintainability of deployments
• Contain faults within a zone
• Allow for zero-downtime upgrades
• Make the different components individually testable
Action Plan

• Disaggregate CloudStack services
• Disaggregate CloudStack Orchestration
  (Cloud-Engine)
• Switch to using well-known frameworks
• Allow better composition at the resource layer
• Change the deployment model for better
  resiliency
Disaggregating CloudStack
CloudStack Functional Layers


• Presentation
  – OAMP API, End User API, AWS API, S3 API
• End User Services
  – Accounts/ACL, Policies, Offerings, Templates, Console Proxy,
    Domains & Projects
• Virtual Resource Management
  – HA, Usage, Statistics Collection, Alerts, VM Sync
• Data Center Abstraction Layer
  – Orchestration, Deployment Planning, Templates, SDN, Snapshots,
    Configuration / Mappings
• Hardware Resource Management
  – Storage Pools, Hypervisor Clusters, L2/L3 Networks,
    Network Services, Object Storage
Pros & Cons
Pros
• Easy for a small team to develop in
• Easy to deploy
Cons
• Interdependency in these layers causes reliability problems.
  – Contracts between layers cannot be enforced since each layer
    cannot be individually tested.
• Developer skill set must range from API design all the way to
  system-level programming to code effectively in CloudStack.
• CloudStack availability and maintainability suffer because
  layers with different availability and maintainability
  requirements are deployed in one process.
Action Plan

      Service                                  Purpose


Cloud-Engine       -   Presents a data center abstraction layer
                   -   Orchestration within the data center abstraction layer
                   -   Provisioning of the physical resources
                   -   Directory for services and service end points
Cloud-Access       -   Account and directory connectors
                   -   Authentication
                   -   ACL & Governance

Cloud-API          -   End User API & UI

Cloud-Management   -   Management of physical resources
                   -   Data Center automation
                   -   Admin UI
CloudStack Service Properties

•   Independent life cycle
•   Independent scaling
•   Independent testing
•   RESTful properties
•   Notification through event systems
•   Individual database (even further in the
    future)
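
As a minimal sketch of the "notification through event systems" property, the Python below shows one service reacting to another's activity purely through published events, with no direct call between them. The EventBus class, the "vm.started" topic, and the payload fields are hypothetical; the proposal does not name a specific event system.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-process stand-in for an event system."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the event to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(event)


bus = EventBus()
usage_records: List[dict] = []

# A usage service records VM starts without the publisher knowing about it.
bus.subscribe("vm.started", usage_records.append)

# Some other service publishes the event.
bus.publish("vm.started", {"vm_id": "vm-1", "zone": "zone-1"})
```

Because the publisher only knows the topic name, a new service can be added by subscribing, without changing existing services.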
Cloud-Engine vs Cloud-API
Data Center Abstraction API
• Speaks in virtualization terms (CPU, RAM, etc)
• Callers can specify deployment scope down to the host
• Can be used to deploy service VMs (such as SSVM and VR)
• Contains orchestration logic
Cloud API
• Speaks in service contracts (service offerings, network
  offerings, disk offerings)
• Callers can only specify deployment destination through
  resource dedication
• Can only deploy user VMs
• Contains business logic
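
The split between the two APIs can be sketched as follows. This is an illustration under assumptions, not the actual CloudStack interfaces: ServiceOffering, DeployRequest, the OFFERINGS table, and both functions are hypothetical names. The Cloud API side resolves a service offering into virtualization terms and cannot name a host; the engine side accepts CPU, RAM, and an optional host.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ServiceOffering:
    """Service contract seen by Cloud-API callers (hypothetical fields)."""
    name: str
    cpu_cores: int
    ram_mb: int


@dataclass(frozen=True)
class DeployRequest:
    """Request in virtualization terms, as the engine side would see it."""
    cpu_cores: int
    ram_mb: int
    host: Optional[str] = None  # only engine-level callers may pin a host


OFFERINGS: Dict[str, ServiceOffering] = {
    "small": ServiceOffering("small", cpu_cores=1, ram_mb=1024),
    "large": ServiceOffering("large", cpu_cores=4, ram_mb=8192),
}


def cloud_api_deploy_user_vm(offering_name: str) -> DeployRequest:
    """Business logic: resolve the offering; no host can be specified here."""
    offering = OFFERINGS[offering_name]
    return DeployRequest(cpu_cores=offering.cpu_cores, ram_mb=offering.ram_mb)


def engine_deploy(request: DeployRequest) -> str:
    """Orchestration stub: report where the VM would be placed."""
    target = request.host if request.host is not None else "planner-chosen host"
    return f"{request.cpu_cores} CPU / {request.ram_mb} MB on {target}"


# A user VM goes through the Cloud API and is described by an offering.
user_vm = engine_deploy(cloud_api_deploy_user_vm("small"))
# A service VM deployed directly through the engine API may pin a host.
service_vm = engine_deploy(DeployRequest(cpu_cores=1, ram_mb=512, host="host-7"))
```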
A Possible Future
• End User Facing Services
  – Data Export/Import Service
  – Console Proxy Service
  – End User VM Mgmt Service
  – AWS API Service
• Customer Care Services
  – ACL/Account Service
  – Usage Service
  – Policy Monitoring Service
• System Administrator Services
  – HA/DR Service
  – Stats Collection Service
  – Resource Mgmt Service
• Notification System
• Cloud-Engine
Disaggregating CloudStack
Orchestration or Cloud-Engine
Why is this important?

• Plugin partners need to clearly see the division
  in functionality between Cloud-Engine and
  their plugin.
• Disaggregating CloudStack Services allows
  developers to quickly add services utilizing
  Cloud-Engine
• Disaggregating Cloud-Engine allows partners
  to add more infrastructure to be utilized in the
  cloud.
Cloud-Engine Components

Component           Purpose
Orchestration       - Orchestration of the Data Center Abstraction Layer
DeploymentPlanner   - Plans the deployment destination for virtual machines and
                      volumes
Compute             - Provisioning of the hypervisor
NetworkGuru         - Provides mapping of Network to physical network
NetworkElement      - Provides various network services
PrimaryDataStore    - Provisioning of storage
ImageStore          - Provisioning of templates
BackingStore        - Provisioning of backup storage
SnapshotService     - Provides volume snapshots
MotionService       - Provides data movements between various storage
                      technologies
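
To make the DeploymentPlanner role concrete, here is a small sketch of a planner that picks a deployment destination. It uses a plain first-fit rule chosen for illustration only; it is not CloudStack's planning algorithm, and the Host fields and the plan_deployment function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    free_cpu: int
    free_ram_mb: int


def plan_deployment(hosts: List[Host], cpu: int, ram_mb: int) -> Optional[Host]:
    """First-fit placement: return the first host with enough free capacity."""
    for host in hosts:
        if host.free_cpu >= cpu and host.free_ram_mb >= ram_mb:
            return host
    return None  # no host can take the VM


hosts = [Host("host-1", 1, 2048), Host("host-2", 8, 16384)]
chosen = plan_deployment(hosts, cpu=4, ram_mb=4096)
```

Keeping the planner behind a narrow function like this is what lets a partner swap in a different placement policy without touching orchestration.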
Cloud-Engine Component Properties

• Recommended to have independent life
  cycles, databases, scaling, and testing.
• Utilize CloudStack’s plugins to bridge
  provisioning needed by Cloud-Engine and
  functionality provided by the component.
• All APIs must be asynchronous.
• Operations are idempotent.
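
The last two properties can be sketched together. In the hypothetical ComponentApi below, create_volume returns a job id immediately instead of blocking (asynchronous), and repeating the call with the same request key returns the same job rather than creating a second volume (idempotent). The class, the request key, and the job states are assumptions made for illustration.

```python
import uuid
from typing import Dict


class ComponentApi:
    """Hypothetical component whose calls return a job id instead of blocking."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}        # job id -> status
        self._by_request: Dict[str, str] = {}  # request key -> job id
        self.volumes_created = 0

    def create_volume(self, request_key: str) -> str:
        # Idempotency: a repeated request key returns the existing job.
        if request_key in self._by_request:
            return self._by_request[request_key]
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = "PENDING"
        self._by_request[request_key] = job_id
        return job_id

    def run_pending(self) -> None:
        # Stand-in for a background worker completing queued jobs.
        for job_id in list(self._jobs):
            if self._jobs[job_id] == "PENDING":
                self.volumes_created += 1
                self._jobs[job_id] = "DONE"

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


api = ComponentApi()
first = api.create_volume("req-42")
retry = api.create_volume("req-42")   # e.g. the caller timed out and retried
status_before = api.status(first)
api.run_pending()
status_after = api.status(first)
```

A caller that loses its connection can safely retry, then poll the job status until the work is done.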
Cloud-Engine Components
[Diagram: Cloud-Engine components and the systems they connect to]
• Data Center Abstraction API
• Event Bus, Database
• Network Subsystem, Compute Subsystem, Storage Subsystem
• Deployment Planning, Template Mgmt, Snapshot Services,
  Backup Services, SDN, Notification System
• Network Service Providers, Physical Network Elements, Hypervisors
• Storage (iSCSI, FC, NFS, Local, etc), Object Store,
  External Event System
Changing CloudStack’s
 Deployment Model
CloudStack 4.0

[Diagram: Region 1 and Region 2, each with its own Mgmt Server
Cluster, covering Data Centers 1 to 5 and Availability Zones 1 to 7]
Pros & Cons
Pros
• Simple deployment model
• Easy to track jobs
Cons
• If the management plane goes down, the entire cloud is
  inoperable.
• No fault containment to the availability zone
• Unable to do a zone-by-zone upgrade of CloudStack
• Cannot guarantee zero-downtime upgrades
New Deployment Model
[Diagram: new deployment model]
• VM Users
• GSLB
• Data Center 1: Cloud-API, Cloud-Engine with its own Database,
  Admin Console
• Data Center n: two Cloud-API servers, Cloud-Engine with its own
  Database, Admin Console
• Two Cloud-Access instances, each with an Account Database,
  connected by DB Sync
• Service Provider Database
Scalability

• Cloud-API nodes can be brought up and added
  to the cluster to handle more requests
• The Cloud-Engine cluster and the Cloud-API cluster are
  scaled independently
  – Cloud-Engine cluster scales with hardware resources
  – Cloud-API cluster scales with incoming requests
Availability

• Cloud-API Servers can be deployed in geographically
  remote locations because they don’t share databases
• One Cloud-API Server going down only impacts the
  tasks it is executing
• Any number of Cloud-API Servers can be brought up
• A Cloud-Engine cluster going down takes down only one
  zone, not the whole cloud.
• Even if the entire Cloud-API cluster is down, admins
  can still manage VMs by directly connecting to the
  Cloud-Engine cluster.
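
The fault-containment claim can be illustrated with a short sketch, assuming one Cloud-Engine per zone and a stateless Cloud-API front end that forwards each request to the target zone's engine. CloudEngine, CloudApi, and ZoneUnavailable are hypothetical names, not classes from the CloudStack code base.

```python
from typing import Dict


class ZoneUnavailable(Exception):
    """Raised when the Cloud-Engine cluster for a zone is down."""


class CloudEngine:
    def __init__(self, zone: str) -> None:
        self.zone = zone
        self.up = True

    def deploy_vm(self, name: str) -> str:
        if not self.up:
            raise ZoneUnavailable(self.zone)
        return f"{name} deployed in {self.zone}"


class CloudApi:
    """Stateless front end that forwards each request to the zone's engine."""

    def __init__(self, engines: Dict[str, CloudEngine]) -> None:
        self._engines = engines

    def deploy_vm(self, zone: str, name: str) -> str:
        return self._engines[zone].deploy_vm(name)


engines = {"zone-1": CloudEngine("zone-1"), "zone-2": CloudEngine("zone-2")}
api = CloudApi(engines)
engines["zone-1"].up = False  # simulate one Cloud-Engine cluster failing

try:
    api.deploy_vm("zone-1", "vm-a")
    zone1_result = "ok"
except ZoneUnavailable:
    zone1_result = "unavailable"

# Requests for the other zone are unaffected by the zone-1 failure.
zone2_result = api.deploy_vm("zone-2", "vm-b")
# An admin can also bypass Cloud-API and talk to an engine directly.
admin_result = engines["zone-2"].deploy_vm("vm-c")
```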
Maintainability

• Zones can be individually upgraded
• Only the zone being upgraded is unavailable
  for provisioning
• Cloud-API Servers can be brought up with new
  versions and then the old ones shut down
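
The Cloud-API upgrade step can be sketched as a rolling replacement: start a server on the new version first, then shut down an old one, so serving capacity never drops below its starting level. The function name and the version labels "4.0" and "4.1" are placeholders chosen for illustration.

```python
from typing import List


def rolling_upgrade(pool: List[str], new_version: str) -> List[List[str]]:
    """Replace servers one at a time; return the pool after every step.

    Each entry in `pool` is the version string of one running server.
    """
    history = []
    for _ in range(len(pool)):
        pool.append(new_version)   # bring up a server on the new version
        history.append(list(pool))
        pool.pop(0)                # then shut down one old server
        history.append(list(pool))
    return history


pool = ["4.0", "4.0"]
steps = rolling_upgrade(pool, "4.1")
```

Inspecting steps shows that every intermediate pool holds at least as many servers as the original one.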
Use Cases
One Infrastructure Multiple Workloads
[Diagram: one infrastructure, multiple workloads]
• Users: Cloud VM Users, Traditional VM Users
• GSLB
• Cloud-API servers: Cloud-API (Traditional), Cloud-API (Cloud),
  Cloud-API (Traditional)
• Data Center 1: Cloud-Engine (Traditional)
• Data Center 2: Cloud-Engine (Cloud)
• Data Center n: Cloud-Engine (Cloud + Traditional)
Dedicated Entry Points
[Diagram: dedicated entry points]
• Users: General VM Users, Customer A
• GSLB
• Cloud-API servers: Cloud-API, Cloud-API, Cloud-API (Dedicated)
• Data Center 1: Cloud-Engine
• Data Center 2: Cloud-Engine (Dedicated to Customer A)
• Data Center n: Cloud-Engine
Hybrid Clouds
[Diagram: hybrid clouds]
• Users: General VM Users, Customer A
• GSLB
• Cloud-API servers: Cloud-API, Cloud-API, Cloud-API,
  Cloud-API (Dedicated)
• Customer Data Center: Cloud-Engine
• Data Center 1: Cloud-Engine
• Data Center n: Cloud-Engine
Milestones

• 12/31
   – New cloud-engine server and deploy VM
         • Alex Huang & Prachi
   – New Storage rearchitecture
         • Edison
   – New IPC mechanism
         • Kelven
• 1/31
   – Completely switched out cloud-api and cloud-management
         • Alex Huang, Rohit
   – Network refactoring
         • Chiradeep
   – API Refactoring
         • Fang, Likitha, Min, Rohit
• 4.2
   – ACL
         • Prachi
The future needs you!
Project web site: http://incubator.apache.org/projects/cloudstack.html

Mailing lists:
cloudstack-dev-subscribe@incubator.apache.org
cloudstack-users-subscribe@incubator.apache.org

IRC: #CloudStack on irc.freenode.net
Thank You!
Alex Huang
Email: alex.huang@gmail.com
Blog: http://xueyuan.github.com/

CloudStack Collaboration Conference 12; Refactoring cloud stack
