Taking care of a cloud environment
No, this session is not about greener IT. Learn about using the RoleEnvironment and diagnostics provided by Windows Azure. Communication between roles, logging and automatic upscaling of your application are just some of the possibilities of what you can do if you know about how the Windows Azure environment works.




Taking care of a cloud environment Presentation Transcript

  • 1. Taking Care of a Cloud Environment: Windows Azure
    Maarten Balliauw, RealDolmen
  • 2. Who am I?
    Maarten Balliauw
    Antwerp, Belgium
    Focus on web
  • 3. Agenda
    Windows Azure Environment
    Fabric Controller
    Windows Azure Guest OS
    Fabric Agent
    Diagnostic Monitor
    Interacting with the Environment
    Interacting with the Fabric
    Monitoring and Diagnostics
    Management API
    Bringing it all together: Automatic scaling
  • 4. Windows Azure Environment
    Where will my application live?
  • 5. Windows Azure environment
  • 6. Fabric Controller
    Communicates with every server within the Fabric
    Interacts with a “Fabric Agent” on each machine
    Monitors every VM, application and instance
    Service Management is performed by the Fabric Controller
  • 7. Fabric Controller
    Manages the life cycle of Azure services
    Allocates resources
    Manages VMs and physical machines in the fabric
    Based on a state machine
    On every heartbeat: compares each service’s goal state with the current node state and tries to move the node to its goal state if possible
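
The goal-state idea on this slide can be illustrated with a small sketch. This is not the Fabric Controller's actual implementation: the state names, the `Node` class and the one-step-per-heartbeat rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical node states, ordered from "nothing running" to "serving traffic".
STATES = ["empty", "os_booted", "role_deployed", "running"]


@dataclass
class Node:
    name: str
    current: str  # state the node is in right now
    goal: str     # state the service definition asks for


def heartbeat(nodes):
    """One heartbeat: move every node a single step toward its goal state.

    Returns True once all nodes have reached their goal state.
    """
    for node in nodes:
        cur, goal = STATES.index(node.current), STATES.index(node.goal)
        if cur < goal:
            node.current = STATES[cur + 1]
        elif cur > goal:
            node.current = STATES[cur - 1]
    return all(n.current == n.goal for n in nodes)


nodes = [Node("node-a", "empty", "running"), Node("node-b", "running", "running")]
beats = 0
converged = False
while not converged:
    converged = heartbeat(nodes)
    beats += 1
```

In this toy model a node that starts empty needs three heartbeats to reach the running state, while a node already in its goal state is left alone.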
  • 8. Fabric Controller
    Resource allocation based on
    Number of update and fault domains
    OS features/versions
    Network channels
    Available load balancers
    Resource allocation is transactional
    Deployments and upgrades
    Optional: manual through service portal
    Standard health and failure monitoring
    Reported by Fabric Agent
    Discovered by Fabric Controller
  • 9. Networking
    VIP automatically registered in load balancers
    Load balancer traffic only to role instances in goal state
    Upgrades can be done by VIP swap
    Web Role
    Web Role
  • 10. Windows Azure Environment
  • 11. Windows Azure Environment
    Fabric Controller
    Virtual machine
    Windows Azure Guest OS (http://bit.ly/aZqSdp)
    Fabric agent
    Diagnostic monitor
    Your web/worker role instance
  • 12. Windows Azure Guest OS
    Based on Windows Server 2008 Enterprise
    3 current versions
    Windows Azure Guest OS 1.0 (Release 200912-01)
    Windows Azure Guest OS 1.1 (Release 201001-01)
    Windows Azure Guest OS 1.2 (Release 201003-01)
    Environment similar to Windows Server 2008
    Performance counters
    Event logs
  • 13. Fabric Agent
    Runs on every node
    Separate process
    Reports current instance’s operational status to FC
    Goal state
  • 14. Diagnostic Monitor
    Runs on every node
    Separate process
    Performs automatic and on-demand diagnostics transfer
  • 15. Interacting with the Fabric
    What can I do to make the most out of it?
  • 16. Interacting with the Fabric
    Through Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment
    What it provides...
    Read the deployment id
    Read configuration values from ServiceConfiguration.cscfg
    Get references to local resources
    Request a recycle of the role instance
    Capture events
    Configuration changes
    Status checks (where FC checks FA)
    Get the current role instance
    And a list of all the other role instances in the current role
    And even a list of all the roles in the deployment (i.e. other web/worker roles)
  • 17. Use Cases
    Read the deployment id
    Can be used when calling the Management API
    Read configuration values
    Configure your application through ServiceConfiguration.cscfg
    Allow your configuration to be modified through Windows Azure portal
    Get references to local resources
    Local, temporary storage on a role instance
    Use for caching data
    Use for temporary file processing
    Request a recycle of the role instance
    e.g. after a configuration change or a specific event
  • 18. Use Cases
    Capture events
    RoleEnvironment_Changing and RoleEnvironment_Changed
    Respond to changes in the Environment
    Configuration change
    Topology changes
    Inform fabric controller of the current state
    Intercept FA status reporting
    Implement your own status reporting conditions
    What to do when the current role is stopping?
    E.g. unmounting drives, resource cleanup, ...
  • 19. Use Cases
    Get all role instances in the current role
    Status checks
    Know about endpoints
    Inter-role communication?
  • 20. Inter-Role Communication
  • 21. Inter-Role Communication
    Scenario: chat application
    Users get connected to different worker roles
    Worker roles should relay messages to other users
    Implement separate worker roles
    Internal endpoint
    Loop over the other role instances and relay messages to them
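
A rough sketch of the relay loop in the chat scenario. The `WorkerInstance` class, the endpoint strings and the in-memory `receive` call are hypothetical stand-ins; in a real deployment the instance list would come from the role environment and the forwarding would be a network call to each internal endpoint.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerInstance:
    instance_id: str
    endpoint: str  # hypothetical internal endpoint, "ip:port"
    local_users: list = field(default_factory=list)
    delivered: list = field(default_factory=list)

    def receive(self, sender, text):
        # Hand the message to every user connected to this instance.
        for user in self.local_users:
            self.delivered.append((user, sender, text))


def relay(current, all_instances, sender, text):
    """Deliver locally, then forward to every other instance of the role."""
    current.receive(sender, text)
    targets = [i for i in all_instances if i.instance_id != current.instance_id]
    for inst in targets:
        inst.receive(sender, text)  # stands in for a call to inst.endpoint
    return [i.endpoint for i in targets]


w0 = WorkerInstance("Worker_IN_0", "10.0.0.1:9000", local_users=["alice"])
w1 = WorkerInstance("Worker_IN_1", "10.0.0.2:9000", local_users=["bob"])
contacted = relay(w0, [w0, w1], "alice", "hi")
```

The instance that received the message skips itself when forwarding, so each user sees the message exactly once.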
  • 22. Monitoring and Diagnostics
    What is my application doing?
  • 23. Diagnostics: Single Server vs. the Cloud
    Single Server
    Static Environment
    Single well-known instance
    Traceable local transactions
    Local Access Feasible
    All in one TS session
    Data & tools co-located
    In-Place Changes
    The Cloud
    Dynamic Environment
    Multi-instance, elastic capacity
    Distributed transactions
    Local Access Infeasible
    Many nodes
    Distributed, scaled-out data
    Service Upgrades
  • 24. Monitoring and Diagnostics
    Through Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitor
    What it provides...
    API for monitoring & data collection for cloud apps
    Support standard diagnostics API
    Manage all role instances or one specific instance
    Scalable: built on WA storage and used by WA components
    Developer in control
  • 25. Windows Azure Diagnostics
    Role Instance
    Data collection
    (traces, logs, crash dumps)
    Quota enforcement
    Diagnostic Monitor
    Local directory storage
    Windows Data Sources
    IIS Logs & Failed Request Logs
    Perf Counters
    Windows Event Logs
  • 26. Windows Azure Diagnostics
    Request upload
    Role Instance
    Windows Azure Storage
    Diagnostic Monitor
    Local directory storage
    Windows Data Sources
    Scheduled or on-demand upload
  • 27. Windows Azure Diagnostics
    Windows Azure
    Hosted Service
  • 28. Development
    Windows Azure Diagnostics
    Windows Azure
    Hosted Service
    Diagnostic Manager
    Some diagnostics application
    Controller Code
  • 29. Feature Summary
    Local data buffering
    Configurable trace, perf counter, Windows event log, IIS log & file buffering
    Local buffer quota management
    Query & modify config from the cloud or from the desktop per role instance
    Transfer to WA Storage
    Scheduled & on-demand
    Filter by data type, verbosity & time range
    Transfer completion notification
    Query & modify from the cloud and from the desktop per role instance
  • 30. Feature Matrix
  • 31. Sample: Activate WA Diagnostics
    // This is done for you automatically by
    // Windows Azure Tools for Visual Studio
    // Add a reference to Microsoft.WindowsAzure.Diagnostics
    // Activate diagnostics in the role's OnStart() method
    // Use the connection string contained in the
    // application configuration setting named
    // "DiagnosticsConnectionString"
    // If the value of this setting is
    // "UseDevelopmentStorage=true" then development storage is used
  • 32. Sample: Web.Config Changes
    This is automatically inserted by VS. The listener routes
    System.Diagnostics.Trace messages to
    Windows Azure Diagnostics.
    <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=, Culture=neutral, PublicKeyToken=31bf3856ad364e35" name="AzureDiagnostics">
    <filter type="" />
  • 33. Sample: Generate Diagnostics Data
    string myRoleInstanceName =
    // Trace with standard .Net tracing APIs
    "Informational trace from " + myRoleInstanceName);
    // Capture full crash dumps
    // Capture mini crash dumps
  • 34. Sample: Enable Local Data Buffering
    // Managed traces, IIS logs, failed request logs,
    // crashdumps and WA diags internal logs are buffered
    // in local storage by default. Other data sources must be
    // added explicitly
    DiagnosticMonitorConfiguration diagConfig =
    // Add performance counter monitoring
    PerformanceCounterConfiguration procTimeConfig = new
    // Run typeperf.exe /q to query for counter names
    procTimeConfig.CounterSpecifier =
    @"\Processor(*)\% Processor Time";
    procTimeConfig.SampleRate = System.TimeSpan.FromSeconds(1.0);
    // Continued on next slide...
  • 35. Sample: Enable Local Data Buffering
    // Continued from previous slide...
    // Add event collection from the Windows Event Log
    // Syntax: <Channel>!<xpath query>
    // http://msdn.microsoft.com/en-us/library/dd996910(VS.85).aspx
    // Restart diagnostics with this custom local buffering
    // configuration
  • 36. Sample: Web.Config Changes
    You can optionally enable IIS failed request tracing.
    This has some performance overhead
    A service upgrade is required to toggle this setting.
    <add provider="ASP" verbosity="Verbose" />
    verbosity="Verbose" />
    <add provider="ISAPI Extension" verbosity="Verbose" />
    <add provider="WWW Server" verbosity="Verbose" />
  • 37. Sample: Scheduled Data Transfer
    // Start off with the default initial configuration
    DiagnosticMonitorConfiguration dc =
    dc.WindowsEventLog.ScheduledTransferPeriod =
    DiagnosticMonitor.Start("DiagnosticsConnectionString", dc);
  • 38. Sample: On-Demand Data Transfer
    // On-Demand transfer of buffered files.
    // This code can live in the role, or on the desktop,
    // or even in another service.
    var ddm = new DeploymentDiagnosticManager(
    var ridm = ddm.GetRoleInstanceDiagnosticManager(
    var dataBuffersToTransfer = DataBufferName.Logs;
    OnDemandTransferOptions transferOptions =
    transferOptions.From = DateTime.MinValue;
    transferOptions.To = DateTime.UtcNow;
    transferOptions.LogLevelFilter = LogLevel.Critical;
    Guid requestID = ridm.BeginOnDemandTransfer(
  • 39. Cerebrata Diagnostics Manager
  • 40. Storage Considerations
    Standard WA Storage costs apply for transactions, storage & bandwidth
    Data Retention
    Local buffers are aged out by the Diagnostic Monitor according to configurable quotas
    You control data retention for data in table/blob storage
    You should manage cleanup of this!
    Query Performance on Tabular Data
    Partitioned by high-order bits of the tick count
    Query by time is efficient
    Filter by verbosity level at transfer time
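
Why querying by time is efficient can be sketched as follows. The sketch assumes that the partition key of the diagnostics tables is the character "0" followed by the .NET tick count truncated to the minute; that format is an assumption here and should be checked against your own tables. The function names are hypothetical.

```python
from datetime import datetime

DOTNET_EPOCH = datetime(1, 1, 1)  # .NET ticks count 100 ns units from this date


def to_ticks(moment):
    """Convert a datetime to .NET ticks using integer arithmetic only."""
    delta = moment - DOTNET_EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def partition_key(moment):
    """Assumed key format: '0' + ticks truncated to the whole minute."""
    minute = moment.replace(second=0, microsecond=0)
    return "0" + str(to_ticks(minute))


def time_range_filter(start, end):
    """Build a table query filter that only touches partitions in [start, end)."""
    return (
        f"PartitionKey ge '{partition_key(start)}' "
        f"and PartitionKey lt '{partition_key(end)}'"
    )


query = time_range_filter(datetime(2010, 5, 1, 12, 0), datetime(2010, 5, 1, 13, 0))
```

Because the keys sort in time order, a time window translates into a contiguous partition key range instead of a scan over the whole table.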
  • 41. Common Diagnostic Tasks
    Performance measurement
    Resource usage
    Troubleshooting and debugging
    Problem detection
    Quality of Service Metrics
    Capacity planning
    Traffic analysis (users, views, peak times)
  • 42. Management API
    How do I manage my deployments?
  • 43. Management API
    Through Microsoft.Samples.WindowsAzure.Management.*
    What it provides...
    X509 client certificates for authentication
    View, create, delete, swap, … deployments
    Edit configuration (and change instance count)
    List and view properties for hosted services, storage accounts and affinity groups
    Also exists as
    PowerShell scripts
    Msbuild tasks (CI & auto-deploy anyone?)
  • 44. Using the management API
  • 45. Auto-Scaling
    Bringing it all together
  • 46. Auto-Scaling
    As easy as doing this?
    <Instances minInstances="3" maxInstances="10" />
    Unfortunately: no…
    “When” should it scale?
    “How” should it scale?
    “Who” / “What” is responsible for scaling?
  • 47. Auto-Scaling – “When”
    Different for every application
    Based on performance counters
    Based on queue length / workload
    Based on the weather?
    Weight of metrics
    Trends in metric data
    “Scaling logic provider”
  • 48. Auto-Scaling - Sensors
    Sensors provide metrics and trend
    Performance counter
    Queue length
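
A sensor that reports both a metric and a trend could look roughly like this. The `Sensor` class, its sliding window and the way the trend is computed (average change per sample) are illustrative assumptions, not the code used in the talk.

```python
from collections import deque


class Sensor:
    """Keeps the most recent samples and derives a metric and a trend."""

    def __init__(self, name, window=5):
        self.name = name
        self.samples = deque(maxlen=window)  # older samples fall out automatically

    def record(self, value):
        self.samples.append(float(value))

    @property
    def metric(self):
        """Average of the samples currently in the window."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    @property
    def trend(self):
        """Average change per sample across the window (0.0 with < 2 samples)."""
        if len(self.samples) < 2:
            return 0.0
        return (self.samples[-1] - self.samples[0]) / (len(self.samples) - 1)


queue_length = Sensor("queue length", window=4)
for value in (10, 20, 30, 40, 50):
    queue_length.record(value)
```

With a window of four, the first sample has already been dropped, so the sensor reports on the values 20 to 50 only.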
  • 49. Auto-Scaling – “How”
    Average topology change takes up to 15 minutes
    What if your load goes up too fast?
    Weight of metrics
    Trends in metric data
    “Scaling logic provider”
  • 50. Auto-Scaling – Scaling logic
    Scaling logic provider uses sensor data to suggest an action (up/fast-up/down/stable)
    To implement per application
    Just a suggestion!
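
One possible shape for a scaling logic provider, as a sketch only. The thresholds, the weights and the assumption that every metric is normalised to the range 0..1 are made up for illustration; each application would need its own rules.

```python
def suggest_action(readings, weights, high=0.75, low=0.25, fast_trend=0.10):
    """Suggest up / fast-up / down / stable from weighted sensor data.

    readings maps a sensor name to (metric, trend), metric normalised to 0..1.
    weights maps the same sensor names to their relative importance.
    """
    total = sum(weights[name] for name in readings)
    load = sum(weights[n] * m for n, (m, _) in readings.items()) / total
    trend = sum(weights[n] * t for n, (_, t) in readings.items()) / total

    if load >= high:
        # Already busy: scale faster when the load is still climbing quickly.
        return "fast-up" if trend >= fast_trend else "up"
    if load <= low and trend <= 0:
        return "down"
    return "stable"


weights = {"cpu": 2, "queue": 1}
busy_and_rising = suggest_action({"cpu": (0.9, 0.2), "queue": (0.8, 0.1)}, weights)
busy_but_flat = suggest_action({"cpu": (0.8, 0.0), "queue": (0.8, 0.0)}, weights)
idle = suggest_action({"cpu": (0.1, -0.05), "queue": (0.2, 0.0)}, weights)
normal = suggest_action({"cpu": (0.5, 0.0), "queue": (0.5, 0.0)}, weights)
```

The result is only a suggestion: whoever is responsible for scaling still decides whether and when to change the instance count.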
  • 51. Auto-Scaling – “Who”/“What”
    A dedicated server / worker role?
    At least two workers for WA SLA: costs!
    The application itself?
    Master election (which role instance is responsible?)
    Approach will differ per application…
  • 52. Auto-Scaling - Responsibilities
    More approaches are possible
    Dedicated worker
    On-premises monitoring app
    The app itself
    Master election based on RoleEnvironment.Roles
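
A minimal sketch of one way to elect a master without any messaging, assuming every instance sees the same list of instance IDs (for example, one read from the role environment). The rule "lowest ID wins" and the ID strings are assumptions for illustration.

```python
def elect_master(instance_ids):
    """Every instance picks the lowest ID, so all of them agree on the master."""
    return min(instance_ids)


def is_master(own_id, instance_ids):
    """True if the calling instance is the one responsible for scaling."""
    return own_id == elect_master(instance_ids)


ids = ["WebRole_IN_2", "WebRole_IN_0", "WebRole_IN_1"]
master = elect_master(ids)
```

The comparison is a plain string comparison, which is enough as long as it is deterministic on every instance. When the master instance disappears, the next topology change yields a new list and therefore a new master.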
  • 53. Auto-Scaling
  • 54. Auto-Scaling Demo
    Scaling based on custom sensor
    Number of logged-in users
    Monitoring done by the app itself
    Which brings everything together:
    Master election → RoleEnvironment
    Performance counters → Diagnostics API
    Queue length → Storage API
    Scaling (changing the number of instances in config) → Management API
  • 55. Takeaways
    What to remember?
  • 56. Takeaways
    Windows Azure Environment components
    Fabric Controller
    Windows Azure Guest OS
    Fabric Agent
    Diagnostic Monitor
    All components provide interaction
    Interacting with the Fabric
    Monitoring and Diagnostics
    Management API
    Bringing it all together gives you power!
  • 57. Q&A
    Any questions?
  • 58. Thank you!