High Availability of Azure Applications

This presentation covers High Availability for applications running in Azure: the fundamentals of High Availability in Azure, followed by an in-depth look at PaaS (High Availability of Web Roles and Worker Roles).

Published in: Software

  1. High Availability of Azure Applications (PaaS). Himanshu Sahu, Mindfire Solutions
  2. Agenda: Introduction; Windows Azure Role Architecture; Fault Domains in Windows Azure; Update Domains in Windows Azure; Windows Azure Host OS Updates; Windows Azure Guest OS Updates; Techniques for High Availability
  3. High Availability in Azure: Introduction. Always on; reliability and scalability; design for failure; implement separation of function; use a service-oriented architecture.
  4. Windows Azure Role Architecture
  5. Fault Domains in Windows Azure. A fault domain is a physical unit of failure, closely tied to the physical infrastructure in the data centers. In Windows Azure a rack can be considered a fault domain, although there is no 1:1 mapping between fault domains and racks. The Windows Azure Fabric is responsible for deploying the instances of your application across different fault domains. Currently the Fabric makes sure that your application uses at least two fault domains. As a developer, you have no direct control over how many fault domains your application will use.
  6. Update Domains in Windows Azure. An update domain (also called an upgrade domain) is a logical unit that determines how a particular service will be upgraded. By default, five upgrade domains are configured for your application. You can control how many upgrade domains your application uses through the upgradeDomainCount attribute in your service definition file (CSDEF).
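To make the effect of the upgrade domain count concrete, here is a minimal Python sketch. It assumes instances are spread over upgrade domains round-robin by instance index; that allocation rule and the function names are illustrative assumptions, not a description of the Fabric's actual algorithm.

```python
def assign_update_domains(instance_count, update_domain_count=5):
    """Map each instance index to an update domain, round-robin (assumed rule)."""
    if instance_count < 1 or update_domain_count < 1:
        raise ValueError("counts must be positive")
    return [index % update_domain_count for index in range(instance_count)]


def max_offline_fraction(instance_count, update_domain_count=5):
    """Largest share of instances that is down while one domain is updated."""
    domains = assign_update_domains(instance_count, update_domain_count)
    # The fullest domain determines the worst-case capacity loss.
    largest = max(domains.count(domain) for domain in set(domains))
    return largest / instance_count
```

Under this assumption, ten instances over the default five domains lose at most a fifth of their capacity at a time, while a two-instance role loses half of it during each step.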
  7. Windows Azure Host Updates: When and Why. Windows Azure deploys updates to the host OS approximately once per month. This ensures that Windows Azure provides a reliable, efficient and secure platform for hosting your applications. The host agent (HA) consists of multiple subcomponents, such as the Network Agent (NA), which manages virtual machine VLANs, and the virtual machine virtual disk driver, which connects virtual machine disks to the blobs containing their data in Windows Azure Storage. Azure therefore updates the HA and its subcomponents at different intervals, depending on when a fix or new functionality is ready.
  8. Windows Azure Host Updates
  9. Windows Azure Host Updates: How. A host OS update reboots the instances on the server, and the fabric controller ensures that instances from only one upgrade domain are rebooted at a time. Virtual machines running on the server that have an Input Endpoint in their role's service model are removed from the load balancer rotation, so no new requests reach them; new requests are instead sent to other instances of that role according to the Azure load-balancing policies. Each virtual machine hosting a Web or Worker Role receives a Stopping event, whereas VM Roles receive a standard Windows shutdown event. Worker, Web, and VM Roles are allowed five minutes to respond to the stopping or shutdown event before they are forcibly stopped.
  10. Windows Azure Host Updates: How (continued). After all guest virtual machines are stopped, the root partition OS shuts down and the server reboots. The updated root partition OS starts. The virtual machines hosted on the server boot and start their application code. Virtual machines hosting service roles with Input Endpoints reconnect to the load balancer, enabling them to receive client requests.
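The five-minute window after the Stopping event is the part a role author can act on. The following Python sketch shows one way a worker might drain its queue within that grace period. The function name, the job list, and the injectable clock are hypothetical; in an actual Web or Worker Role this logic would live in the handler for the Stopping event (for example in the role's OnStop override), not in Python.

```python
import time

# Grace period taken from the slide: roles get five minutes to stop.
STOP_GRACE_SECONDS = 5 * 60


def drain_on_stopping(pending_jobs, process_job,
                      grace_seconds=STOP_GRACE_SECONDS,
                      clock=time.monotonic):
    """Finish as many queued jobs as the grace period allows.

    Returns (finished, unfinished) so the caller can hand the
    unfinished jobs back to a durable queue before the VM stops.
    """
    deadline = clock() + grace_seconds
    unfinished = list(pending_jobs)
    finished = []
    # Check the deadline before starting each job, never mid-job.
    while unfinished and clock() < deadline:
        job = unfinished.pop(0)
        process_job(job)
        finished.append(job)
    return finished, unfinished
```

The clock is passed in so that the behavior can be checked without waiting five minutes. A job that starts just before the deadline can still overrun it, so individual jobs should be short relative to the grace period.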
  11. Windows Azure Guest Updates. Once the host OS has finished upgrading across the datacenter, the guest OS is upgraded for services that are configured to use automatic guest OS versions; this upgrade follows the standard upgrade domain rules for your service. Your VM will be rebooted and the Windows partition (the D drive) will be reimaged with the upgraded OS. The guest OS update process is much faster than the host OS update, since the fabric only has to coordinate the update within your hosted service and your upgrade domains.
  12. Availability. An available application takes into account the availability of its underlying infrastructure and dependent services. Available applications remove single points of failure through redundancy and resilient design. Azure SLA; more instances in Azure; make Guest OS updates manual.
  13. Availability. Scalability directly affects availability: an application that fails under increased load is no longer available. Scalable applications are able to meet increased demand with consistent results in acceptable time windows. Auto Scaling in Azure.
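As a rough illustration of what an auto-scaling rule does, the Python sketch below computes a target instance count from average CPU load. The 60% CPU target and the minimum and maximum instance counts are made-up values for the example, and the function is not an Azure API; it only shows the proportional-scaling idea with a floor that keeps at least two instances running.

```python
import math


def desired_instances(current, avg_cpu_percent,
                      target_cpu_percent=60.0,
                      min_instances=2, max_instances=10):
    """Scale the instance count so average CPU moves toward the target."""
    if current < 1:
        raise ValueError("current must be at least 1")
    # Proportional rule: more load per instance means more instances.
    raw = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    # Clamp so the role never drops below the redundancy floor
    # or grows past the (hypothetical) budget ceiling.
    return max(min_instances, min(max_instances, raw))
```

Keeping the floor at two or more matters for availability as much as for load: a role scaled down to one instance has no redundancy left.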
  14. Availability: Protection against hardware failures. Because every application is made up of multiple instances of each role, hardware failures (a disk crash, a network fault, or the death of a server machine) won't take down the application. To help with this, the fabric controller doesn't choose machines for an application's instances at random. Instead, different instances of the same role are placed in different fault domains. A fault domain is a set of hardware (computers, switches, and more) that shares a single point of failure. For example, all of the computers in a single fault domain might rely on the same switch to connect to the network. Because of this, a single hardware failure can't take down an entire application. The application might temporarily lose some instances, but it will continue to behave correctly.
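A small Python simulation can show why spreading instances over fault domains protects against a single hardware failure. The round-robin placement, the default of two fault domains, and the instance names are assumptions made for the example; the real fabric controller's placement logic is not described in the slides.

```python
from collections import defaultdict


def place_instances(instance_count, fault_domain_count=2):
    """Spread role instances over fault domains round-robin (assumed policy)."""
    if instance_count < 1 or fault_domain_count < 1:
        raise ValueError("counts must be positive")
    placement = defaultdict(list)
    for index in range(instance_count):
        # Instance names are illustrative only.
        placement[index % fault_domain_count].append(f"WebRole_IN_{index}")
    return dict(placement)


def survivors(placement, failed_domain):
    """Return the instances still running after one fault domain fails."""
    return [
        name
        for domain, names in sorted(placement.items())
        if domain != failed_domain
        for name in names
    ]
```

With three instances, losing fault domain 0 still leaves one instance serving requests. With a single instance, the same failure leaves none, which is why the protection only applies to roles that run multiple instances.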
  15. Availability: Protection against software failures. The fabric controller can also detect failures caused by software. If the code in an instance crashes or the VM in which it's running goes down, the fabric controller will start either just the code or, if necessary, a new VM for that role. While any work the instance was doing when it failed will be lost, the new instance will become part of the application as soon as it starts running.
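The restart behavior described above resembles a supervisor loop. The Python sketch below is a simplified stand-in for that idea, not the fabric controller's implementation: `supervise` and the `max_restarts` limit are hypothetical, and the instance is modeled as a plain callable that either returns or raises.

```python
def supervise(run_instance, max_restarts=3):
    """Run an instance and restart it from scratch whenever it crashes.

    Returns (result, restarts). In-flight work from a crashed run is
    lost, matching the slide: each restart begins with a fresh call.
    """
    restarts = 0
    while True:
        try:
            return run_instance(), restarts
        except Exception:
            # Give up once the (assumed) restart budget is spent.
            if restarts >= max_restarts:
                raise
            restarts += 1
```

Because a restart discards whatever the instance was doing, role code is usually written so that unfinished work can be picked up again, for example by reading it from a durable queue.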
  16. Availability: The ability to update applications with no application downtime. When a new version of the application needs to be deployed, the fabric controller can shut down the instances in just one update domain, update the code for these, then create new instances from that new code. Once those instances are running, it can do the same thing to instances in the next update domain, and so on. While users might see different versions of the application during this process, depending on which instance they happen to interact with, the application as a whole remains continuously available.
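The domain-by-domain walk can be sketched in a few lines of Python. The data layout (a dict of update domain to instance versions) and the function name are assumptions for illustration; the sketch only records which versions are live after each step, which is the mixed-version window the slide mentions.

```python
import copy


def rolling_update(instances_by_domain, new_version):
    """Update one update domain at a time and record the live versions.

    instances_by_domain maps domain id -> {instance name: version}.
    Returns a list of (domain, sorted live versions) after each step.
    The input is copied, so the caller's data is left unchanged.
    """
    state = copy.deepcopy(instances_by_domain)
    history = []
    for domain in sorted(state):
        # Only this domain's instances are replaced in this step.
        for name in state[domain]:
            state[domain][name] = new_version
        live = sorted({v for inst in state.values() for v in inst.values()})
        history.append((domain, live))
    return history
```

After every step except the last, both versions are live at once, so the old and new code need to tolerate each other's requests and data for the duration of the update.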
  17. Availability: The ability to update Windows and other supporting software with no application downtime. The answer is update domains. :)
  18. Resources: instance-restarts-due-to-os-upgrades.aspx
  19. Questions?
  20. Thank you!