ES19 – Under the Hood: Inside the Cloud Computing Hosting Environment

    Presentation Transcript

    • Under the Hood: Inside The Cloud Computing Hosting Environment
      Erick Smith
      Development Manager
      Microsoft Corporation
      Chuck Lenzmeier
      Microsoft Corporation
    • Introduce the fabric controller
      Introduce the service model
      Give some insight into how it all works
      Describe the workings at the data center level
      Then zoom in to a single machine
      Purpose Of This Talk/Agenda
    • Resource allocation
      Machines must be chosen to host roles of the service
      Fault domains, update domains, resource utilization, hosting environment, etc.
      Procure additional hardware if necessary
      IP addresses must be acquired
      Machines must be set up
      Virtual machines created
      Applications configured
      DNS setup
      Load balancers must be programmed
      Locate appropriate machines
      Update the software/settings as necessary
      Only bring down a subset of the service at a time
      Maintaining service health
      Software faults must be handled
      Hardware failures will occur
      Logging infrastructure is provided to diagnose issues
      This is ongoing work…you’re never done
      Deploying A Service Manually
    • Windows Azure Fabric Controller
      Control VM
      WS08 Hypervisor
      Service Roles
      Out-of-band communication – hardware control
      In-band communication – software control
      Node can be a VM or a physical machine
      Fabric Controller
    • Windows Azure Automation
      Fabric Controller
      “What” is needed
      Fabric Controller (FC)
      Maps declarative service specifications to available resources
      Manages service life cycle starting from bare metal
      Maintains system health and satisfies SLA
      What’s special about it
      Model-driven service management
      Enables utility-model shared fabric
      Automates hardware management
      Make it happen
    • Owns all the data center hardware
      Uses the inventory to host services
      Similar to what a per machine operating system does with applications
      The FC provisions the hardware as necessary
      Maintains the health of the hardware
      Deploys applications to free resources
      Maintains the health of those applications
      Fabric Controller
    • Modeling Services
      Public Internet
      Template automatically maps to service model
      Process Role
      Web Role
      Fundamental Services
      Load Balancer Channel
      Directory Resource
    • The topology of your service
      The roles and how they are connected
      Attributes of the various components
      Operating system features required
      Configuration settings
      Describe exposed interfaces
      Required characteristics
      How many fault/update domains you need
      How many instances of each role
      What You Describe In Your Service Model…
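
      As a rough illustration of the kind of declarative description listed above, the sketch below models a service with roles, connections, settings, and fault/update domain requirements. The class and field names (RoleSpec, ServiceSpec, etc.) are hypothetical, not the actual Windows Azure service model schema.

        # Illustrative only: a declarative description of a service's topology and requirements.
        from dataclasses import dataclass, field
        from typing import Dict, List

        @dataclass
        class RoleSpec:
            name: str                       # e.g. "FrontEnd"
            role_type: str                  # e.g. "Web" or "Worker"
            instance_count: int             # how many instances of this role
            endpoints: List[str] = field(default_factory=list)      # exposed interfaces
            os_features: List[str] = field(default_factory=list)    # required OS features
            settings: Dict[str, str] = field(default_factory=dict)  # app config settings

        @dataclass
        class ServiceSpec:
            name: str
            roles: List[RoleSpec]
            channels: List[tuple]           # how roles are connected, e.g. ("FrontEnd", "BackEnd")
            fault_domains: int              # how many fault domains you need
            update_domains: int             # how many update domains you need

        spec = ServiceSpec(
            name="MyCloudService",
            roles=[
                RoleSpec("FrontEnd", "Web", instance_count=3, endpoints=["http:80"]),
                RoleSpec("BackEnd", "Worker", instance_count=2),
            ],
            channels=[("FrontEnd", "BackEnd")],
            fault_domains=2,
            update_domains=2,
        )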
    • Allows you to specify what portion of your service can be offline at a time
      Fault domains are based on the topology of the data center
      Switch failure
      Statistical in nature
      Update domains are determined by what percentage of your service you will take out at a time for an upgrade
      You may experience outages for both at the same time
      System considers fault domains when allocating service roles
      Example: Don’t put all roles in same rack
      System considers update domains when upgrading a service
      Fault/Update Domains
      Fault domains
      Allocation is across fault domains
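
      A minimal sketch of how instances might be spread across fault and update domains, assuming simple round-robin assignment; the real placement is driven by the data center topology, as noted above.

        # Illustrative only: spread role instances round-robin across fault and update domains,
        # so no single rack failure or upgrade step takes out every instance.
        def assign_domains(instance_count, fault_domains, update_domains):
            assignments = []
            for i in range(instance_count):
                assignments.append({
                    "instance": i,
                    "fault_domain": i % fault_domains,    # placement across racks/switches
                    "update_domain": i % update_domains,  # which upgrade step it belongs to
                })
            return assignments

        for a in assign_domains(instance_count=6, fault_domains=2, update_domains=3):
            print(a)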
    • Purpose: Communicate settings to service roles
      There is no “registry” for services
      Application configuration settings
      Declared by developer
      Set by deployer
      System configuration settings
      Pre-declared, same kinds for all roles
      Instance ID, fault domain ID, update domain ID
      Assigned by the system
      In both cases, settings accessible at run time
      Via call-backs when values change
      Dynamic Configuration Settings
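
      The sketch below illustrates the idea of run-time settings with change call-backs; RoleConfig and its methods are hypothetical stand-ins for the real role runtime APIs.

        # A sketch of run-time configuration with change call-backs, as described above.
        from typing import Callable, Dict, List

        class RoleConfig:
            def __init__(self, app_settings: Dict[str, str], system_settings: Dict[str, str]):
                self._settings = {**app_settings, **system_settings}
                self._listeners: List[Callable[[str, str], None]] = []

            def get(self, key: str) -> str:
                return self._settings[key]

            def on_change(self, callback: Callable[[str, str], None]) -> None:
                self._listeners.append(callback)          # invoked when a value changes

            def apply_update(self, key: str, value: str) -> None:
                self._settings[key] = value               # deployer or system pushes a new value
                for cb in self._listeners:
                    cb(key, value)

        cfg = RoleConfig(
            app_settings={"ConnectionString": "..."},     # declared by developer, set by deployer
            system_settings={"InstanceId": "0", "FaultDomainId": "1", "UpdateDomainId": "0"},  # assigned by the system
        )
        cfg.on_change(lambda k, v: print(f"setting {k} changed to {v}"))
        cfg.apply_update("ConnectionString", "new-value")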
    • Windows Azure Service Lifecycle
      Goal is to automate life cycle as much as possible
    • Resource allocation
      Nodes are chosen based on constraints encoded in the service model
      Fault domains, update domains, resource utilization, hosting environment, etc.
      VIPs/LBs are reserved for each external interface described in the model
      Allocated hardware is assigned a new goal state
      FC drives hardware into goal state
      FC can upgrade a running service
      Maintaining service health
      Software faults must be handled
      Hardware failures will occur
      Logging infrastructure is provided to diagnose issues
      Lifecycle Of A Windows Azure Service
    • Primary goal – find a home for all role instances
      Essentially a constraint satisfaction problem
      Allocate instances across “fault domains”
      Example constraints include
      Only roles from a single service can be assigned to a node
      Only a single instance of a role can be assigned to a node
      Node must contain a compatible hosting environment
      Node must have enough resources remaining
      Service model allows for simple hints as to the resources the role will utilize
      Node must be in the correct fault domain
      Nodes should only be considered if healthy
      A machine can be sub-partitioned into VMs
      Performed as a transaction
      Resources Come From Our Shared Pool
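
      A toy version of the constraint filtering described above, assuming each node and role is a plain dictionary; it is only meant to show the shape of the constraint-satisfaction problem, not the FC's actual allocator.

        # Illustrative only: keep nodes that are healthy, compatible, have capacity, and sit in a
        # fault domain not already used by this role; the whole allocation is transactional.
        def eligible(node, role, used_fault_domains):
            return (node["healthy"]
                    and node["host_env"] == role["host_env"]            # compatible hosting environment
                    and node["free_capacity"] >= role["capacity_hint"]  # enough resources remaining
                    and role["name"] not in node["roles"]               # one instance of a role per node
                    and node["fault_domain"] not in used_fault_domains) # spread across fault domains

        def allocate(role, instance_count, nodes):
            placements, used_fds = [], set()
            for _ in range(instance_count):
                candidates = [n for n in nodes if eligible(n, role, used_fds)]
                if not candidates:
                    raise RuntimeError("constraints cannot be satisfied")
                node = max(candidates, key=lambda n: n["free_capacity"])
                node["roles"].append(role["name"])
                node["free_capacity"] -= role["capacity_hint"]
                used_fds.add(node["fault_domain"])
                placements.append((role["name"], node["id"]))
            return placements

        nodes = [{"id": i, "healthy": True, "host_env": "vm", "free_capacity": 4,
                  "fault_domain": i % 2, "roles": []} for i in range(4)]
        web_role = {"name": "FrontEnd", "host_env": "vm", "capacity_hint": 1}
        print(allocate(web_role, instance_count=2, nodes=nodes))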
    • Key FC Data Structures
      Logical Node
      Logical Role Instance
      Logical Role
      Logical Service
      Role Instance Description
      Role Description
      Physical Node
      Service Description
    • Maintaining Node State
      Logical Node
      Logical Role Instance
      Goal State
      Current State
      Physical Node
    • FC maintains a state machine for each node
      Various events cause node to move into a new state
      FC maintains a cache about the state it believes each node to be in
      State reconciled with true node state via communication with agent
      Goal state derived based on assigned role instances
      On a heartbeat event the FC tries to move the node closer to its goal state (if it isn’t already there)
      FC tracks when goal state is reached
      Certain events clear the “in goal state” flag
      The FC Provisions Machines…
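
      To make the heartbeat-driven reconciliation concrete, here is a small sketch with an invented state sequence; the real FC's node states and transitions are more involved.

        # Sketch: the FC caches what it believes each node's state to be, reconciles it with the
        # agent's report, and on each heartbeat nudges the node one step closer to its goal state.
        GOAL_SEQUENCE = ["bare_metal", "os_imaged", "agent_running", "roles_deployed", "in_goal_state"]

        class NodeRecord:
            def __init__(self, node_id, goal="in_goal_state"):
                self.node_id = node_id
                self.believed_state = "bare_metal"   # FC's cached view
                self.goal_state = goal               # derived from assigned role instances
                self.in_goal_state = False

            def on_heartbeat(self, reported_state):
                self.believed_state = reported_state                 # reconcile cache with the agent
                if self.believed_state == self.goal_state:
                    self.in_goal_state = True
                    return None
                self.in_goal_state = False                           # not at goal yet; flag stays cleared
                next_index = GOAL_SEQUENCE.index(self.believed_state) + 1
                return GOAL_SEQUENCE[next_index]                     # next step to push toward goal

        node = NodeRecord("node-42")
        print(node.on_heartbeat("os_imaged"))       # -> "agent_running"
        print(node.on_heartbeat("in_goal_state"))   # -> None, goal reached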
    • Virtual IPs (VIPs) are allocated from a pool
      Load balancer (LB) setup
      VIPs and dedicated IP (DIP) pools are programmed automatically
      DIPs are marked in/out of service as the FC's belief about the state of role instances changes
      LB probing is set up to communicate with agent on node which has real time info on health of role
      Traffic is only routed to roles ready to accept traffic
      Routing information is sent to agent to configure routes based on network configuration
      Redundant network gear is in place for high availability
      …And Other Data Center Resources
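
      A simplified picture of the VIP/DIP bookkeeping described above, with made-up addresses: the set of DIPs receiving traffic is recomputed from agent-reported health.

        # Illustrative only: a VIP fronting a set of dedicated IPs (DIPs).
        class VipPool:
            def __init__(self, vip, dips):
                self.vip = vip
                self.health = {dip: False for dip in dips}   # DIPs start out of service

            def probe_result(self, dip, healthy):
                self.health[dip] = healthy                   # agent reports real-time role health

            def active_dips(self):
                # traffic is only routed to roles that are ready to accept it
                return [dip for dip, ok in self.health.items() if ok]

        pool = VipPool("203.0.113.10:80", ["10.0.0.4", "10.0.0.5", "10.0.0.6"])
        pool.probe_result("10.0.0.4", True)
        pool.probe_result("10.0.0.5", False)
        print(pool.active_dips())   # ['10.0.0.4']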
    • Windows Azure FC monitors the health of roles
      FC detects if a role dies
      A role can indicate it is unhealthy
      Upon learning a role is unhealthy
      Current state of the node is updated appropriately
      State machine kicks in again to drive us back into goal state
      Windows Azure FC monitors the health of the host
      If the node goes offline, FC will try to recover it
      If a failed node can’t be recovered, FC migrates role instances to a new node
      A suitable replacement location is found
      Existing role instances are notified of the configuration change
      The FC Keeps Your Service Running
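
      The recovery flow above could be sketched roughly as follows; all function and field names are placeholders.

        # Illustrative only: when an instance is unhealthy and its node can't be recovered,
        # find a replacement node and notify the other instances of the configuration change.
        def handle_unhealthy_instance(instance, nodes, service_instances):
            instance["state"] = "unhealthy"                      # current state updated
            if not try_recover(instance["node"]):
                new_node = pick_replacement(nodes)               # suitable replacement location
                instance["node"], instance["state"] = new_node, "restarting"
                for peer in service_instances:
                    if peer is not instance:
                        peer["pending_config_update"] = True     # existing instances are notified

        def try_recover(node):
            return node.get("reachable", False)

        def pick_replacement(nodes):
            return next(n for n in nodes if n["healthy"] and n["free_capacity"] > 0)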
    • FC can upgrade a running service
      Resources deployed to all nodes in parallel
      Done by updating one “update domain” at a time
      Update domains are logical and don’t need to be tied to a fault domain
      Goal state for a given node is updated when the appropriate update domain is reached
      Two modes of operation
      Rollbacks are achieved with the same basic mechanism
      How Upgrades Are Handled
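
      A sketch of walking the update domains one at a time; wait_until_healthy is a stand-in for the FC's real health checks, and rollback would reuse the same walk with the previous version as the target.

        # Illustrative only: move one update domain at a time to the new goal state.
        def rolling_upgrade(instances, new_version, update_domains):
            for ud in range(update_domains):
                batch = [i for i in instances if i["update_domain"] == ud]
                for inst in batch:
                    inst["goal_version"] = new_version     # goal state updated when this UD is reached
                wait_until_healthy(batch)                  # don't advance until the batch is back in service

        def wait_until_healthy(batch):
            for inst in batch:
                inst["version"] = inst["goal_version"]     # stand-in for the real health check

        instances = [{"id": i, "update_domain": i % 2, "version": "v1", "goal_version": "v1"} for i in range(4)]
        rolling_upgrade(instances, "v2", update_domains=2)
        print(instances)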
    • Windows Azure provisions and monitors hardware elements
      Compute nodes, TOR/L2 switches, LBs, access routers, and node OOB control elements
      Hardware life cycle management
      Burn-in tests, diagnostics, and repair
      Failed hardware taken out of pool
      Application of automatic diagnostics
      Physical replacement of failed hardware
      Capacity planning
      On-going node and network utilization measurements
      Proven process for bringing new hardware capacity online
      Behind The Scenes Work
    • Your services are isolated from other services
      Can access resources declared in model only
      Local node resources – temp storage
      Network end-points
      Isolation using multiple mechanisms
      Automatic application of Windows security patches
      Rolling operating system image upgrades
      Service Isolation And Security
    • FC is a cluster of 5-7 replicas
      Replicated state with automatic failover
      New primary picks up seamlessly from failed replica
      Even if all FC replicas are down, services continue to function
      Rolling upgrade support of FC itself
      FC cluster is modeled and controlled by a utility “root” FC
      Windows Azure FC Is Highly Available
      [Diagram: a client node running the FC Agent talks to the FC cluster; a primary FC node and two secondary FC nodes each run the FC Core and Object Model, kept consistent by the replication system]
    • Network has redundancy built in
      Redundant switches, load balancers, and access routers
      Services are deployed across fault domains
      Load balancers route traffic to active nodes only
      Windows Azure FC state check-pointed periodically
      Can roll-back to previous checkpoints
      Guards against corrupted FC state, loss of all replicated state, operator errors
      FC state is stored on multiple replicas across fault domains
      Windows Azure Fabric Is Highly Available
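
      Check-pointing and roll-back of FC state can be pictured with a toy like the following; the actual mechanism operates over replicated state stored across fault domains.

        # Toy illustration of periodic state check-pointing with roll-back.
        import copy

        class CheckpointedState:
            def __init__(self, state):
                self.state = state
                self.checkpoints = []

            def checkpoint(self):
                self.checkpoints.append(copy.deepcopy(self.state))   # taken periodically

            def rollback(self):
                if self.checkpoints:
                    self.state = self.checkpoints.pop()              # guard against corrupted state

        fc = CheckpointedState({"services": {"MyCloudService": "running"}})
        fc.checkpoint()
        fc.state["services"]["MyCloudService"] = "corrupted"
        fc.rollback()
        print(fc.state)   # {'services': {'MyCloudService': 'running'}}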
    • PDC release
      Automated service deployment
      Three service templates
      Support for changing number of running instances
      Simple service upgrades/downgrades
      Automated service failure discovery and recovery
      External VIP address/DNS name per service
      Service network isolation enforcement
      Automated hardware management
      Includes automated network load-balancer management
      For 2009
      Ability to model more complex applications
      Richer service life-cycle management
      Richer network management
      Service Life-cycle
    • Windows Azure automates most functions
      System takes care of running and keeping services up
      Service owner in control
      Self-management model through portal
      Secure and highly-available platform
      Built-in data center management
      Capacity planning
      Hardware and network management
    • Virtualization And Deployment
    • Multi-tenancy with security and isolation
      Improved ‘performance/watt/$’ ratio
      Increased operations automation
      Hypervisor-based virtualization
      Highly efficient and scalable
      Leverages hardware advances
      Virtual Computing Environment
    • High-Level Architecture
      [Diagram: a host partition running the Host OS (Server Core) alongside guest partitions, each running a Guest OS (Server Enterprise)]
    • Images are virtual hard disks (VHDs)
      Offline construction and servicing of images
      Separate operating system and service images
      Same deployment model for root partition
      Image-Based Deployment
    • Image-Based Deployment
      [Diagram: a Maintenance OS alongside the host partition and guest partitions. The host partition boots a differencing VHD over the HV-enabled Server Core base VHD; each guest partition boots a differencing VHD over a Server Enterprise base VHD and mounts an application VHD containing its package (App1, App2, App3)]
    • Deployment of images is just file copy
      No installation
      Background process
      Image caching for quick update and rollback
      Servicing is an offline process
      Dynamic allocation based on business needs
      Net: High availability at lower cost
      Rapid And Reliable Provisioning
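
      The "deployment is just file copy" idea, with image caching, might look roughly like this (paths and names are illustrative):

        # Sketch: keep a local cache of VHD images; deploying or rolling back is just
        # copying any missing image and pointing the VM at an already-cached one.
        import shutil
        from pathlib import Path

        def deploy_image(image_name, repository, cache_dir):
            cache_dir = Path(cache_dir)
            cache_dir.mkdir(parents=True, exist_ok=True)
            cached = cache_dir / image_name
            if not cached.exists():                                  # image caching for quick update/rollback
                shutil.copy(Path(repository) / image_name, cached)   # no installation, just a file copy
            return cached                                            # VM boots from this VHD

        # e.g. deploy_image("app2.vhd", "/repo/images", "/local/image-cache")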
    • Tech Preview offers one virtual machine type
      Platform: 64-bit Windows Server 2008
      CPU: 1.5-1.7 GHz x64 equivalent
      Memory: 1.7 GB
      Network: 100 Mbps
      Transient local storage: 250 GB
      Windows Azure storage also available: 50 GB
      Full service model supports more virtual machine types
      Expect to see more options post-PDC
      Windows Azure Compute Instance
    • Hypervisor
      Efficient: Exploit latest processor virtualization features (e.g., SLAT, large pages)
      Scalable: NUMA-aware for scalability
      Small: Takes up few resources
      Host/guest operating system
      Windows Server 2008 compatible
      Optimized for virtualized environment
      I/O performance equally shared between virtual machines
      Windows Azure Virtualization
    • Shadow page tables (SPTs) are expensive
      SLAT requires less hypervisor intervention than shadow page tables
      Allows more CPU cycles to be spent on real work
      Releases memory allocated for SPTs
      SLAT supports large page sizes (2 MB and 1 GB)
      Second-Level Address Translation
    • The system is divided into small groups of processors (NUMA nodes)
      Each node has dedicated memory (local)
      Nodes can access memory residing in other nodes (remote), but with extra latency
      NUMA Support
    • NUMA Support
    • NUMA-aware for virtual machine scalability
      Hypervisor schedules resources to improve performance characteristics
      Assign “near” memory to virtual machine
      Select “near” logical processor for virtual processor
      NUMA Scalability
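
      A toy model of NUMA-aware placement, assuming per-node free memory and logical processor counts; the hypervisor's real scheduler is considerably more sophisticated.

        # Illustrative only: prefer a NUMA node whose local memory and logical processors
        # can both hold the virtual machine, so memory accesses stay "near".
        def place_vm(vm_mem_gb, vm_vcpus, numa_nodes):
            candidates = [n for n in numa_nodes
                          if n["free_mem_gb"] >= vm_mem_gb and n["free_lps"] >= vm_vcpus]
            if not candidates:
                return None                     # would have to span nodes and pay remote-memory latency
            best = max(candidates, key=lambda n: n["free_mem_gb"])
            best["free_mem_gb"] -= vm_mem_gb    # assign "near" memory
            best["free_lps"] -= vm_vcpus        # select "near" logical processors
            return best["id"]

        nodes = [{"id": 0, "free_mem_gb": 4, "free_lps": 2}, {"id": 1, "free_mem_gb": 16, "free_lps": 8}]
        print(place_vm(vm_mem_gb=2, vm_vcpus=1, numa_nodes=nodes))   # -> 1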
    • NUMA-Aware Scheduler
    • Scheduler
      Tuned for datacenter workloads (ASP.NET, etc.)
      More predictability and fairness
      Tolerate heavy I/O loads
      Intercept reduction
      Spin lock enlightenments
      Reduce TLB flushes
      VMBUS bandwidth improvement
      More Hypervisor Optimizations
    • Automated, reliable deployment
      Streamlined and consistent
      Verifiable through offline provisioning
      Efficient, scalable hypervisor
      Maximizing CPU cycles on customer applications
      Optimized for datacenter workload
      Reliable and secure virtualization
      Compute instances are isolated from each other
      Predictable and consistent behavior
    • Related PDC sessions
      A Lap Around Cloud Services
      Architecting Services For The Cloud
      Cloud Computing: Programming In The Cloud
      Related PDC labs
      Windows Azure Hands-on Labs
      Windows Azure Lounge
      Web site http://www.azure.com/windows
      Related Content
    • Evals & Recordings
      Please fill out your evaluation for this session at:
      This session will be available as a recording at:
    • Please use the microphones provided
    • © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
      The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.