Apache CloudStack Evolution Proposal
Alex Huang
Software Architect, Citrix Systems

A little bit about me
• Cloud.Com Founding Engineer
• Software architect for CloudPlatform
• Responsible for overall architecture, performance, and scalability
• Committer and PPMC member
• BS from UC Berkeley and MS from Stanford

Design Goals
• Make it easier for developers to get started
• Allow developers with different skill sets to work on different parts of CloudStack
• Give service providers the choice to deploy only the parts of CloudStack that they want to use
• Allow CloudStack components to be written in languages other than Java
• Increase the availability and maintainability of deployments
• Contain faults within a zone
• Allow for zero downtime upgrades
• Improve the testability of individual components

Action Plan
• Disaggregate CloudStack services
• Disaggregate CloudStack Orchestration (Cloud-Engine)
• Switch to using well-known frameworks
• Allow better composition at the resource layer
• Change the deployment model for better resiliency

CloudStack Functional Layers
[Diagram: five layers, top to bottom]
• Presentation: OAMP API, End User API, AWS API, S3 API
• End User Services: Domains & Accounts/ACL, Policies, Offerings, Templates, Console Proxy, Projects
• Virtual Resource Management: Statistics Collection, HA, Usage, Alerts, VM Sync
• Data Center Abstraction Layer: Deployment Planning, Configuration / Mappings, Orchestration, Templates, SDN, Snapshots
• Hardware Resource Management: Storage Pools, Hypervisor Clusters, L2/L3 Networks, Network Services, Object Storage

Pros & Cons
Pros
• Easy for a small team to develop in
• Easy to deploy
Cons
• Interdependency in these layers causes reliability problems.
  – Contracts between layers cannot be enforced since each layer cannot be individually tested.
• Developer skill sets must range from API design all the way to system-level programming to code effectively in CloudStack
• CloudStack availability and maintainability suffer because layers with different availability and maintainability requirements are deployed in one process.

Action Plan
Service and purpose:
• Cloud-Engine
  – Presents a data center abstraction layer
  – Orchestration within the data center abstraction layer
  – Provisioning of the physical resources
  – Directory for services and service end points
• Cloud-Access
  – Account and directory connectors
  – Authentication
  – ACL & Governance
• Cloud-API
  – End User API & UI
• Cloud-Management
  – Management of physical resources
  – Data Center automation
  – Admin UI

CloudStack Service Properties
• Independent life cycle
• Independent scaling
• Independent testing
• RESTful properties
• Notification through event systems
• Individual database (even further in the future)

Cloud-Engine vs Cloud-API
Data Center Abstraction API (Cloud-Engine)
• Speaks in virtualization terms (CPU, RAM, etc)
• Callers can specify deployment scope down to the host
• Can be used to deploy service VMs (such as SSVM and VR)
• Contains orchestration logic
Cloud API
• Speaks in service contracts (service offerings, network offerings, disk offerings)
• Callers can only specify deployment destination through resource dedication
• Can only deploy user VMs
• Contains business logic

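The split between the two APIs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `EngineDeployRequest`, `SERVICE_OFFERINGS`, `DEDICATED_HOSTS`, and `cloud_api_deploy` are hypothetical names invented for this sketch, not CloudStack classes or calls. It shows Cloud-API accepting a service offering and turning it into the virtualization terms that Cloud-Engine speaks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EngineDeployRequest:
    """Cloud-Engine view: raw virtualization terms plus an explicit scope."""
    cpu_cores: int
    ram_mb: int
    host: Optional[str] = None   # engine callers may scope down to one host
    service_vm: bool = False     # e.g. SSVM or VR


# Hypothetical catalog of service offerings known to Cloud-API.
SERVICE_OFFERINGS = {
    "small": {"cpu_cores": 1, "ram_mb": 1024},
    "large": {"cpu_cores": 4, "ram_mb": 8192},
}

# Hypothetical resource dedication: account -> host reserved for it.
DEDICATED_HOSTS = {"customer-a": "host-17"}


def cloud_api_deploy(account: str, offering: str) -> EngineDeployRequest:
    """Cloud-API view: the caller names an offering, never CPU, RAM, or a host."""
    spec = SERVICE_OFFERINGS[offering]
    return EngineDeployRequest(
        cpu_cores=spec["cpu_cores"],
        ram_mb=spec["ram_mb"],
        host=DEDICATED_HOSTS.get(account),  # destination only via dedication
        service_vm=False,                   # Cloud-API deploys user VMs only
    )
```

An administrator tool talking directly to Cloud-Engine could instead build an `EngineDeployRequest` by hand, including `host` and `service_vm=True`, which is the extra control the left-hand column describes.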
A Possible Future
[Diagram: services layered on top of Cloud-Engine, grouped under End User Facing Services, System Administrator Services, and Customer Care Services. Services shown: Console Proxy Service, AWS API Service, Data Export/Import Service, End User VM Mgmt Service, ACL/Account Service, HA/DR Service, Notification Service, Usage Service, System Stats Collection Service, Policy Service, Monitoring Service, Resource Mgmt Service.]

Disaggregating CloudStack Orchestration or Cloud-Engine

Why is this important?
• Plugin partners need to clearly see the division in functionality between Cloud-Engine and their plugin.
• Disaggregating CloudStack Services allows developers to quickly add services that utilize Cloud-Engine.
• Disaggregating Cloud-Engine allows partners to add more infrastructure to be utilized in the cloud.

Cloud-Engine Components
Component and purpose:
• Orchestration: Orchestration of the Data Center Abstraction Layer
• DeploymentPlanner: Plans the deployment destination for virtual machines and volumes
• Compute: Provisioning of the hypervisor
• NetworkGuru: Provides mapping of Network to physical network
• NetworkElement: Provides various network services
• PrimaryDataStore: Provisioning of storage
• ImageStore: Provisioning of templates
• BackingStore: Provisioning of backup storage
• SnapshotService: Provides volume snapshots
• MotionService: Provides data movements between various storage technologies

Cloud-Engine Component Properties
• Recommended to have independent life cycles, databases, scaling, and testing.
• Utilize CloudStack’s plugins to bridge provisioning needed by Cloud-Engine and functionality provided by the component.
• All APIs must be asynchronous.
• Operations are idempotent.

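One way to read the last two properties together is that a component accepts a request, immediately hands back a job handle, and treats a retried request as the same job. The Python sketch below illustrates that idea only; `Job`, `ComponentApi`, and the idempotency key are hypothetical names assumed for this example and are not taken from the CloudStack code base.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: str
    operation: str
    status: str = "PENDING"


@dataclass
class ComponentApi:
    """Hypothetical component front end: accepts work and returns at once."""
    _jobs_by_key: dict = field(default_factory=dict)

    def submit(self, idempotency_key: str, operation: str) -> Job:
        # Idempotent: a retried request with the same key maps to the same job
        # instead of provisioning the resource a second time.
        existing = self._jobs_by_key.get(idempotency_key)
        if existing is not None:
            return existing
        # Asynchronous: only record the job here; a worker would run it later.
        job = Job(job_id=str(uuid.uuid4()), operation=operation)
        self._jobs_by_key[idempotency_key] = job
        return job

    def complete(self, job: Job) -> None:
        # Called by the (not shown) worker once the operation has finished.
        job.status = "DONE"
```

A caller that times out can simply resubmit with the same key and poll the returned job, without risking a duplicate volume or VM.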
Cloud-Engine Components
[Diagram. Labels shown: Data Center Abstraction API; Deployment Planning; Network Subsystem; Storage Subsystem; Compute Subsystem; Event Bus; Database; Network Service Providers; SDN; Physical Network Elements; Storage (iSCSI, FC, NFS, Local, etc); Backup Services; Template Mgmt; Snapshot Services; Hypervisors; Object Store; Notification System; External Event System.]

CloudStack 4.0
[Diagram: Region 1 and Region 2, each with its own Mgmt Server Cluster, covering Availability Zones 1 through 7 spread across Data Centers 1 through 5.]

Pros & Cons
Pros
• Simple deployment model
• Easy to track jobs
Cons
• If the management plane goes down, the entire cloud is inoperable.
• No fault containment to the availability zone
• Unable to do a zone-by-zone upgrade of CloudStack
• Cannot guarantee zero downtime upgrades

New Deployment Model
[Diagram: VM Users reach Cloud-API servers through a GSLB. Data Center 1 through Data Center n each contain Cloud-Access and Cloud-Engine with their own Account Database and Database, plus an Admin Console. Other labels shown: Service Provider Database, DB Sync.]

Scalability
• Cloud-API nodes can be brought up and added to the cluster to handle more requests
• Cloud-Engine cluster and Cloud-API cluster are scaled independently
  – Cloud-Engine cluster scaled to hardware resources
  – Cloud-API cluster scaled to incoming requests

Availability
• Cloud-API Servers can be deployed in geographically remote locations because they don’t share databases
• One Cloud-API Server going down only impacts the tasks it is executing
• Any number of Cloud-API Servers can be brought up
• A Cloud-Engine cluster going down means only one zone is down, not the whole cloud.
• Even if the entire Cloud-API cluster is down, admins can still manage VMs by directly connecting to the Cloud-Engine cluster.

Maintainability
• Zones can be individually upgraded
• Provisioning is unavailable only in the zone being upgraded
• Cloud-API Servers can be brought up with new versions and then the old ones shut down

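The last bullet describes a start-new-then-stop-old rollout. The Python sketch below models that ordering on a plain list so the capacity claim can be checked; `rolling_upgrade`, the server names, and the version strings are hypothetical and chosen only for this illustration.

```python
def rolling_upgrade(pool, new_version):
    """Replace every server in `pool` with one running `new_version`.

    `pool` is a list of (name, version) tuples. Returns the upgraded pool and
    the smallest pool size seen after each replacement, so callers can check
    that serving capacity never dipped below the starting size.
    """
    pool = list(pool)
    min_size = len(pool)
    for name, version in list(pool):
        if version == new_version:
            continue  # already upgraded, leave it running
        pool.append((f"{name}-v{new_version}", new_version))  # bring up new first
        pool.remove((name, version))                          # then shut down old
        min_size = min(min_size, len(pool))
    return pool, min_size
```

Because the new server joins before the old one leaves, the pool size in this model never drops below where it started, which is the property a zero downtime upgrade relies on.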
One Infrastructure Multiple Workloads
[Diagram: Cloud VM Users and Traditional VM Users reach Data Center 1, Data Center 2, through Data Center n via a GSLB. Cloud-API instances are labeled (Traditional), (Cloud), and (Traditional); Cloud-Engine instances are labeled (Cloud + Traditional), (Cloud), and (Traditional).]

Dedicated Entry Points
[Diagram: General VM Users and Customer A reach Data Center 1, Data Center 2, through Data Center n via a GSLB. Labels shown: Cloud-API, Cloud-API, Cloud-API (Dedicated), Cloud-Engine, Cloud-Engine, Cloud-Engine (Dedicated to Customer A).]

Hybrid Clouds
[Diagram: General VM Users and Customer A reach the Customer Data Center and Data Center 1 through Data Center n via a GSLB. Labels shown: Cloud-API (three instances), Cloud-API (Dedicated), Cloud-Engine (three instances).]

Milestones
• 12/31
  – New cloud-engine server and deploy VM
    • Alex Huang & Prachi
  – New Storage rearchitecture
    • Edison
  – New IPC mechanism
    • Kelven
• 1/31
  – Completely switched out cloud-api and cloud-management
    • Alex Huang, Rohit
  – Network refactoring
    • Chiradeep
  – API Refactoring
    • Fang, Likitha, Min, Rohit
• 4.2
  – ACL
    • Prachi

The future needs you!
Project web site: http://incubator.apache.org/projects/cloudstack.html
Mailing lists: firstname.lastname@example.org@incubator.apache.org
IRC: #CloudStack on irc.freenode.net

Thank You!
Alex Huang
Email: email@example.com
Blog: http://xueyuan.github.com/