ES19 – Under the Hood: Inside the Cloud Computing Hosting Environment
  1. 1. Under the Hood: Inside The Cloud Computing Hosting Environment<br />ES19<br />Erick Smith<br />Development Manager<br />Microsoft Corporation<br />Chuck Lenzmeier<br />Architect<br />Microsoft Corporation<br />
  2. 2. Introduce the fabric controller<br />Introduce the service model<br />Give some insight into how it all works<br />Describe the workings at the data center level<br />Then zoom in to a single machine<br />Purpose Of This Talk/Agenda<br />
  3. 3. Resource allocation<br />Machines must be chosen to host roles of the service<br />Fault domains, update domains, resource utilization, hosting environment, etc.<br />Procure additional hardware if necessary<br />IP addresses must be acquired<br />Provisioning<br />Machines must be set up<br />Virtual machines created<br />Applications configured<br />DNS setup<br />Load balancers must be programmed<br />Upgrades<br />Locate appropriate machines<br />Update the software/settings as necessary<br />Only bring down a subset of the service at a time<br />Maintaining service health<br />Software faults must be handled<br />Hardware failures will occur<br />Logging infrastructure is provided to diagnose issues<br />This is ongoing work…you’re never done<br />Deploying A Service Manually<br />
  4. 4. Windows Azure Fabric Controller<br />VM<br />Control VM<br />VM<br />VM<br />WS08 Hypervisor<br />Service Roles<br />Control <br />Agent<br />Out-of-band communication – hardware control<br />WS08<br />In-band communication – software control<br />Load-balancers<br />Node can be a VM or a physical machine<br />Switches<br />Highly-available<br />Fabric Controller<br />
  5. 5. Windows Azure Automation<br />Fabric Controller<br />“What” is needed<br />Fabric Controller (FC) <br />Maps declarative service specifications to available resources<br />Manages service life cycle starting from bare metal<br />Maintains system health and satisfies SLA<br />What’s special about it<br />Model-driven service management <br />Enables utility-model shared fabric<br />Automates hardware management<br />Make it happen<br />Fabric<br />Switches<br />Load-balancers<br />
  6. 6. Owns all the data center hardware<br />Uses the inventory to host services<br />Similar to what a per machine operating system does with applications<br />The FC provisions the hardware as necessary<br />Maintains the health of the hardware<br />Deploys applications to free resources<br />Maintains the health of those applications<br />Fabric Controller<br />
  7. 7. Modeling Services<br />Public Internet<br />Template automatically maps to service model<br />Background <br />Process Role<br />Front-end<br />Web Role<br />Load<br /> Balancer<br />Fundamental Services<br />Load Balancer Channel<br />Endpoint<br />Interface<br />Directory Resource<br />
  8. 8. The topology of your service<br />The roles and how they are connected<br />Attributes of the various components<br />Operating system features required<br />Configuration settings<br />Describe exposed interfaces<br />Required characteristics<br />How many fault/update domains you need<br />How many instances of each role<br />What You Describe In Your Service Model…<br />
  9. 9. Allows you to specify what portion of your service can be offline at a time<br />Fault domains are based on the topology of the data center<br />Switch failure<br />Statistical in nature<br />Update domains are determined by what percentage of your service you will take out at a time for an upgrade<br />You may experience outages for both at the same time<br />System considers fault domains when allocating service roles<br />Example: Don’t put all roles in same rack<br />System considers update domains when upgrading a service<br />Fault/Update Domains<br />Fault domains<br />Allocation is across fault domains<br />
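The allocation-across-fault-domains idea above can be sketched as follows. This is an illustrative Python fragment, not the actual FC allocator; the round-robin policy and all names are assumptions for the example.

```python
# Hypothetical sketch: spread role instances across fault domains
# (e.g., racks behind different switches) so a single switch failure
# takes out at most one domain's worth of instances.
from itertools import cycle

def spread_across_fault_domains(instances, fault_domains):
    """Assign each role instance to a fault domain round-robin."""
    domains = cycle(fault_domains)
    return {inst: next(domains) for inst in instances}

placement = spread_across_fault_domains(
    ["web_0", "web_1", "web_2", "web_3"], ["rack_A", "rack_B"])
# Losing one rack now leaves half the instances running.
```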
  10. 10. Purpose: Communicate settings to service roles<br />There is no “registry” for services<br />Application configuration settings<br />Declared by developer<br />Set by deployer<br />System configuration settings<br />Pre-declared, same kinds for all roles<br />Instance ID, fault domain ID, update domain ID<br />Assigned by the system<br />In both cases, settings accessible at run time<br />Via call-backs when values change<br />Dynamic Configuration Settings<br />
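The call-back pattern for dynamic settings can be sketched in a few lines. This is not the real Windows Azure runtime API, just a hypothetical illustration of settings being readable at run time and change notifications firing.

```python
# Minimal sketch (hypothetical API) of dynamic configuration settings:
# roles read values at run time and register call-backs that fire when
# the deployer or the system changes a value.
class RoleSettings:
    def __init__(self, initial):
        self._values = dict(initial)
        self._callbacks = []

    def get(self, name):
        return self._values[name]

    def on_change(self, callback):
        self._callbacks.append(callback)

    def update(self, name, value):        # invoked by the platform
        self._values[name] = value
        for cb in self._callbacks:
            cb(name, value)

settings = RoleSettings({"LogLevel": "Warning", "InstanceId": "IN_0"})
seen = []
settings.on_change(lambda k, v: seen.append((k, v)))
settings.update("LogLevel", "Verbose")    # deployer changes a setting
```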
  11. 11. Windows Azure Service Lifecycle<br />Goal is to automate life cycle as much as possible<br />Automated<br />Automated<br />Developer/<br />Deployer<br />Developer<br />
  12. 12. Resource allocation<br />Nodes are chosen based on constraints encoded in the service model<br />Fault domains, update domains, resource utilization, hosting environment, etc.<br />VIPs/LBs are reserved for each external interface described in the model<br />Provisioning<br />Allocated hardware is assigned a new goal state<br />FC drives hardware into goal state<br />Upgrades<br />FC can upgrade a running service<br />Maintaining service health<br />Software faults must be handled<br />Hardware failures will occur<br />Logging infrastructure is provided to diagnose issues<br />Lifecycle Of A Windows Azure Service<br />
  13. 13. Primary goal – find a home for all role instances<br />Essentially a constraint satisfaction problem<br />Allocate instances across “fault domains”<br />Example constraints include<br />Only roles from a single service can be assigned to a node<br />Only a single instance of a role can be assigned to a node<br />Node must contain a compatible hosting environment<br />Node must have enough resources remaining<br />Service model allows for simple hints as to the resources the role will utilize<br />Node must be in the correct fault domain<br />Nodes should only be considered if healthy<br />A machine can be sub-partitioned into VMs<br />Performed as a transaction<br />Resources Come From Our Shared Pool<br />
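The constraint-filtering step of the allocation problem can be sketched as a simple predicate over candidate nodes. This is illustrative Python, not the FC's implementation; the field names are assumptions for the example.

```python
# Illustrative sketch of constraint satisfaction for node selection:
# a node is a candidate only if it satisfies every constraint from
# the service model.
def eligible_nodes(nodes, role, used_fault_domains):
    def ok(node):
        return (node["healthy"]                                  # healthy only
                and node["hosting_env"] == role["hosting_env"]   # compatible env
                and node["free_cores"] >= role["cores"]          # enough resources
                and role["name"] not in node["roles"]            # one instance/node
                and node["fault_domain"] not in used_fault_domains)
    return [n for n in nodes if ok(n)]

nodes = [
    {"healthy": True,  "hosting_env": "vm", "free_cores": 2,
     "roles": set(), "fault_domain": 0},
    {"healthy": False, "hosting_env": "vm", "free_cores": 4,
     "roles": set(), "fault_domain": 1},   # unhealthy: never considered
    {"healthy": True,  "hosting_env": "vm", "free_cores": 1,
     "roles": set(), "fault_domain": 2},   # not enough resources
]
role = {"name": "web", "hosting_env": "vm", "cores": 2}
candidates = eligible_nodes(nodes, role, used_fault_domains=set())
```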
  14. 14. Key FC Data Structures<br />Logical Node<br />Logical Role Instance<br />Logical Role<br />Logical Service<br />Role Instance Description<br /> Role Description<br />Physical Node<br />Service Description<br />
  15. 15. Maintaining Node State<br />Logical Node<br />Logical Role Instance<br />Goal State<br />Current State<br />Physical Node<br />
  16. 16. FC maintains a state machine for each node<br />Various events cause node to move into a new state<br />FC maintains a cache about the state it believes each node to be in<br />State reconciled with true node state via communication with agent<br />Goal state derived based on assigned role instances<br />On a heartbeat event the FC tries to move the node closer to its goal state (if it isn’t already there)<br />FC tracks when goal state is reached<br />Certain events clear the “in goal state” flag<br />The FC Provisions Machines…<br />
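The heartbeat-driven reconciliation can be sketched as one step of a state machine. The states below are illustrative, not the FC's actual ones; a simple linear path stands in for the real transition graph.

```python
# Sketch of goal-state reconciliation: on each heartbeat, move the node
# one step closer to its goal state (or report it is already there).
PATH = ["bare", "os_deployed", "app_deployed", "running"]

def on_heartbeat(current, goal):
    """Return the next state that moves the node toward its goal."""
    if current == goal:
        return current                    # "in goal state" flag stays set
    i, g = PATH.index(current), PATH.index(goal)
    return PATH[i + 1] if i < g else PATH[i - 1]
```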
  17. 17. Virtual IPs (VIPs) are allocated from a pool<br />Load balancer (LB) setup<br />VIPs and dedicated IP (DIP) pools are programmed automatically<br />DIPs are marked in/out of service as the FC’s belief about the state of role instances changes<br />LB probing is set up to communicate with the agent on the node, which has real-time info on the health of the role<br />Traffic is only routed to roles ready to accept traffic<br />Routing information is sent to the agent to configure routes based on network configuration<br />Redundant network gear is in place for high availability<br />…And Other Data Center Resources<br />
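The "route only to in-service DIPs" behavior can be sketched as below. A hypothetical round-robin policy stands in for the real load balancer logic; the names are assumptions.

```python
# Sketch of the LB's routing decision for one VIP: traffic goes only
# to DIPs whose role instances are currently believed healthy.
def route(dips, dip_in_service, request_id):
    live = [d for d in dips if dip_in_service.get(d)]
    if not live:
        raise RuntimeError("no in-service instances behind this VIP")
    return live[request_id % len(live)]   # simple round-robin

health = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
```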
  18. 18. Windows Azure FC monitors the health of roles<br />FC detects if a role dies<br />A role can indicate it is unhealthy<br />Upon learning a role is unhealthy<br />Current state of the node is updated appropriately<br />State machine kicks in again to drive us back into goal state<br />Windows Azure FC monitors the health of the host<br />If the node goes offline, FC will try to recover it<br />If a failed node can’t be recovered, FC migrates role instances to a new node<br />A suitable replacement location is found<br />Existing role instances are notified of the configuration change<br />The FC Keeps Your Service Running<br />
  19. 19. FC can upgrade a running service<br />Resources deployed to all nodes in parallel<br />Done by updating one “update domain” at a time<br />Update domains are logical and don’t need to be tied to a fault domain<br />Goal state for a given node is updated when the appropriate update domain is reached<br />Two modes of operation<br />Manual<br />Automatic<br />Rollbacks are achieved with the same basic mechanism<br />How Upgrades Are Handled<br />
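The one-update-domain-at-a-time walk can be sketched as a short loop. This is an illustrative fragment, not the FC's upgrade engine; the callback names are assumptions for the example.

```python
# Sketch of a rolling upgrade: one update domain at a time, halting
# before the next domain if the just-upgraded one is not healthy.
def rolling_upgrade(update_domains, upgrade_one, is_healthy):
    for domain in update_domains:         # only this slice is offline
        for node in domain:
            upgrade_one(node)
        if not all(is_healthy(n) for n in domain):
            raise RuntimeError("upgrade halted before next update domain")

upgraded = []
rolling_upgrade([["n1", "n2"], ["n3", "n4"]],
                upgraded.append, lambda n: True)
```

A rollback can reuse the same loop with the previous image as the target, which matches the slide's note that rollbacks use the same basic mechanism.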
  20. 20. Windows Azure provisions and monitors hardware elements<br />Compute nodes, TOR/L2 switches, LBs, access routers, and node OOB control elements<br />Hardware life cycle management<br />Burn-in tests, diagnostics, and repair<br />Failed hardware taken out of pool<br />Application of automatic diagnostics<br />Physical replacement of failed hardware<br />Capacity planning<br />On-going node and network utilization measurements<br />Proven process for bringing new hardware capacity online<br />Behind The Scenes Work<br />
  21. 21. Your services are isolated from other services<br />Can access only resources declared in the model<br />Local node resources – temp storage<br />Network end-points<br />Isolation using multiple mechanisms<br />Automatic application of Windows security patches<br />Rolling operating system image upgrades<br />Service Isolation And Security<br />
  22. 22. FC is a cluster of 5-7 replicas<br />Replicated state with automatic failover<br />New primary picks up seamlessly from failed replica<br />Even if all FC replicas are down, services continue to function<br />Rolling upgrade support of FC itself<br />FC cluster is modeled and controlled by a utility “root” FC<br />Windows Azure FC Is Highly Available<br />Client Node<br />FC Agent<br />FC Core<br />FC Core<br />FC Core<br />Object Model<br />Object Model<br />Object Model<br />Primary FC Node<br />Secondary FC Node<br />Secondary FC Node<br />Uncommitted<br />Committed<br />Committed<br />Committed<br />Disk<br />Disk<br />Disk<br />Replication system<br />
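The committed/uncommitted distinction in the diagram can be sketched with a majority-commit rule: a state change counts as committed only once a majority of the 5-7 replicas have appended it, so a new primary can resume from the last committed state. This is a simplified illustration, not the FC's actual replication protocol.

```python
# Simplified sketch of primary-based replication with majority commit.
class Replica:
    def __init__(self, up=True):
        self.up, self.log = up, []

    def append(self, change):
        if self.up:
            self.log.append(change)       # change is durable on this replica
        return self.up

def replicate(change, replicas):
    acks = sum(1 for r in replicas if r.append(change))
    return acks >= len(replicas) // 2 + 1   # majority => committed
```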
  23. 23. Network has redundancy built in<br />Redundant switches, load balancers, and access routers <br />Services are deployed across fault domains<br />Load balancers route traffic to active nodes only<br />Windows Azure FC state check-pointed periodically<br />Can roll-back to previous checkpoints<br />Guards against corrupted FC state, loss of all replicated state, operator errors<br />FC state is stored on multiple replicas across fault domains<br />Windows Azure Fabric Is Highly Available<br />
  24. 24. PDC release<br />Automated service deployment<br />Three service templates<br />Support for changing number of running instances<br />Simple service upgrades/downgrades<br />Automated service failure discovery and recovery<br />External VIP address/DNS name per service<br />Service network isolation enforcement <br />Automated hardware management<br />Include automated network load-balancer management<br />For 2009<br />Ability to model more complex applications<br />Richer service life-cycle management <br />Richer network management<br />Service Life-cycle<br />
  25. 25. Windows Azure automates most functions<br />System takes care of running and keeping services up<br />Service owner in control<br />Self-management model through portal<br />Secure and highly-available platform<br />Built-in data center management<br />Capacity planning<br />Hardware and network management<br />Summary<br />
  26. 26. Virtualization And Deployment<br />
  27. 27. Multi-tenancy with security and isolation<br />Improved ‘performance/watt/$’ ratio<br />Increased operations automation<br />Hypervisor-based virtualization<br />Highly efficient and scalable<br />Leverages hardware advances<br />Virtual Computing Environment<br />
  28. 28. High-Level Architecture<br />Guest OS<br />Server Enterprise<br />Guest OS<br />Server Enterprise<br />Host OS<br />Server Core<br />Applications<br />Applications<br />VirtualizationStack<br />(VSC)<br />VirtualizationStack<br />(VSC)<br />VirtualizationStack<br />(VSP)<br />Drivers<br />Hypervisor<br />GuestPartition<br />Host Partition<br />GuestPartition<br />VMBUS<br />VMBUS<br />VMBUS<br />Hardware<br />CPU<br />NIC<br />Disk1<br />Disk2<br />
  29. 29. Images are virtual hard disks (VHDs)<br />Offline construction and servicing of images<br />Separate operating system and service images<br />Same deployment model for root partition<br />Image-Based Deployment<br />
  30. 30. Image-Based Deployment<br />Maintenance OS<br />Host Partition<br />Guest Partition<br />Guest Partition<br />Guest Partition<br />Application VHD<br />Application VHD<br />Application VHD<br />App1 Package<br />App3 Package<br />App2 Package<br />Host partition differencing VHD<br />Guest partition differencing VHD<br />Guest partition differencing VHD<br />Guest partition differencing VHD<br />HV-enabled Server Core base VHD<br />Server Enterprise base VHD<br />Server Core base VHD<br />Server Enterprise base VHD<br />
  31. 31. Deployment of images is just file copy<br />No installation<br />Background process<br />Multicast<br />Image caching for quick update and rollback<br />Servicing is an offline process<br />Dynamic allocation based on business needs<br />Net: High availability at lower cost<br />Rapid And Reliable Provisioning<br />
  32. 32. Tech Preview offers one virtual machine type<br />Platform: 64-bit Windows Server 2008<br />CPU: 1.5-1.7 GHz x64 equivalent<br />Memory: 1.7 GB<br />Network: 100 Mbps<br />Transient local storage: 250 GB<br />Windows Azure storage also available: 50 GB<br />Full service model supports more virtual machine types<br />Expect to see more options post-PDC<br />Windows Azure Compute Instance<br />
  33. 33. Hypervisor<br />Efficient: Exploit latest processor virtualization features (e.g., SLAT, large pages)<br />Scalable: NUMA-aware for scalability<br />Small: Takes up minimal resources<br />Host/guest operating system<br />Windows Server 2008 compatible<br />Optimized for virtualized environment<br />I/O performance equally shared between virtual machines<br />Windows Azure Virtualization<br />
  34. 34. Shadow page tables (SPTs) are expensive<br />SLAT requires less of the hypervisor intervention associated with shadow page tables<br />Allows more CPU cycles to be spent on real work<br />Releases memory allocated for SPTs<br />SLAT supports large page sizes (2 MB and 1 GB)<br />Second-Level Address Translation<br />
  35. 35. The system is divided into small groups of processors (NUMA nodes)<br />Each node has dedicated memory (local) <br />Nodes can access memory residing in other nodes (remote), but with extra latency<br />NUMA Support<br />
  36. 36. NUMA Support<br />
  37. 37. NUMA-aware for virtual machine scalability<br />Hypervisor schedules resources to improve performance characteristics<br />Assign “near” memory to virtual machine <br />Select “near” logical processor for virtual processor<br />NUMA Scalability<br />
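The "near memory" preference can be sketched as a placement choice: prefer a NUMA node whose free local memory can hold the whole virtual machine, so its virtual processors mostly touch local memory. The best-fit policy and field names are assumptions for the example, not the hypervisor's actual scheduler.

```python
# Sketch of NUMA-aware VM placement favoring local ("near") memory.
def pick_numa_node(numa_nodes, vm_mem_mb):
    fits = [n for n in numa_nodes if n["free_mem_mb"] >= vm_mem_mb]
    if not fits:
        return None                       # VM would need remote memory
    return min(fits, key=lambda n: n["free_mem_mb"])   # tightest fit

nodes = [{"id": 0, "free_mem_mb": 4096}, {"id": 1, "free_mem_mb": 2048}]
```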
  38. 38. NUMA-Aware Scheduler<br />
  39. 39. Scheduler<br />Tuned for datacenter workloads (ASP.NET, etc.)<br />More predictability and fairness<br />Tolerate heavy I/O loads<br />Intercept reduction<br />Spin lock enlightenments<br />Reduce TLB flushes<br />VMBUS bandwidth improvement<br />More Hypervisor Optimizations<br />
  40. 40. Automated, reliable deployment<br />Streamlined and consistent<br />Verifiable through offline provisioning<br />Efficient, scalable hypervisor<br />Maximizing CPU cycles on customer applications <br />Optimized for datacenter workload<br />Reliable and secure virtualization<br />Compute instances are isolated from each other<br />Predictable and consistent behavior<br />Summary<br />
  41. 41. Related PDC sessions<br />A Lap Around Cloud Services<br />Architecting Services For The Cloud<br />Cloud Computing: Programming In The Cloud<br />Related PDC labs<br />Windows Azure Hands-on Labs<br />Windows Azure Lounge<br />Web site http://www.azure.com/windows<br />Related Content<br />
  42. 42. Evals & Recordings<br />Please fill out your evaluation for this session at:<br />This session will be available as a recording at:<br />www.microsoftpdc.com<br />
  43. 43. Please use the microphones provided<br />Q&A<br />
  44. 44. © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.<br />The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.<br />