Windows Azure Platform: Articles from the Trenches, Volume One



Developers have been exploring the possibilities opened up by the Windows Azure Platform for Cloud Computing. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others become successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications.

Published in: Technology
  • 1. The Windows Azure Platform: Articles from the Trenches, Volume One. Editor and copy and paste guru: Eric Nelson and 15 authors smarter than him. 22nd June 2010 (v0.9). Cover art by Andrew Fryer.
  • 2. The Windows Azure Platform: Articles from the Trenches TABLE OF CONTENTS INTRODUCTION 6 From the Editor 6 Would you like to become an author for a future edition? 6 Introduction to the Windows Azure Platform 7 AE – Acronyms Explained  8 CHAPTER 1: GETTING STARTED 9 5 steps to getting started with Windows Azure 9 Step 1: Creating an Azure account. 9 Step 2: Provisioning a SQL Azure database 9 Step 3: Building a Web Application for Azure 10 Step 4: Packaging the Web Application for Windows Azure 11 Step 5: Deploying the Web Application to Azure. 11 The best tools for working with the Windows Azure Platform 14 Category: The usual suspects 14 Category: Windows Azure Storage 14 Category: Windows Azure diagnostics 17 Category: SQL Azure 18 Category: General Development 19 CHAPTER 2: WINDOWS AZURE PLATFORM 20 Architecting For Azure – Building Highly Scalable Applications 20 Principles of Azure Architectures 20 Partition Data 20 Colocation 21 Cache 21 State 21 Distribute Workloads Effectively 22 Maximise Resources 22 Summary 23 The Windows Azure Platform and Cost-Oriented Architecture 24 Cost is important 24 What costs to consider 24 Conclusion 25 De-risking Your First Windows Azure Project 26 Popular Risks 26 Non-Technical Tactics for Reducing Risk 27 Technical Tactics for Reducing Risk 28 2
  • 3. The Windows Azure Platform: Articles from the Trenches Developer Responsibility 29 Trials & tribulations of working with Azure when there’s more than one of you 30 Development Environment 30 Test Environment 30 Certificates 31 When things go wrong 31 Summary 31 Using a Continuous Integration build to achieve an automated deployment of your latest build 32 Getting the right “bits” 32 Packaging for deployment 32 Deploying 33 Using Java with the Windows Azure Platform 35 Accessing Windows Azure Storage from Java 35 Running Java Code on Windows Azure 36 AzureRunme 37 CHAPTER 3: WINDOWS AZURE 39 Auto-Scaling Windows Azure Compute Instances 39 Introduction 39 A Basic Approach 39 The Scale Agent 39 Monitoring: Retrieving Diagnostic Information 40 Rules: Establishing When To Scale 41 Trust: Authorising For Scale 42 Scaling – The Service Management API 44 Summary 45 Building a Content-Based Router Service on Windows Azure 46 Bing Maps Tile Servers using Azure Blob Storage 49 Azure Drive 51 Guest OS 51 VHD 51 CloudDrive 52 Development Environment 53 Azure Table Service as a NoSQL database 55 Master-Detail structures 55 Dynamic schema 55 Column names as data 56 Table names as data 56 3
  • 4. The Windows Azure Platform: Articles from the Trenches Summary 57 Queries and Azure Tables 58 CreateQuery<T>() 58 Contexts 59 Querying on PartitionKey and RowKey 59 Continuation 60 DataServiceQuery 60 CloudTableQuery 61 Tricks for storing time and date fields in Table Storage 64 Using Worker Roles to Implement a Distributed Cache 68 Configuring the Cache 68 Using the Distributed Cache 69 Logging, diagnostics and health monitoring of Windows Azure Applications 71 Collecting diagnostic data 71 Persisting diagnostic data 72 Analysing the diagnostic data 72 More information 73 Service Runtime in Windows Azure 74 Roles and Instances 74 Endpoints 74 Service Upgrades 74 Service Definition and Service Configuration 75 RoleEntryPoint 75 Role 76 RoleEnvironment 76 RoleInstance 77 RoleInstanceEndpoint 78 LocalResource 78 CHAPTER 4: SQL AZURE 79 Connecting to SQL Azure in 5 Minutes 79 Prerequisite – Get a SQL Azure account 79 Working with the SQL Azure Portal 79 Create a database through the Server Administration 80 Configuring the firewall 80 Connecting using SQL Server Management Studio 81 Application credentials 83 Keep in mind – the target database 83 CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC 85 4
  • 5. The Windows Azure Platform: Articles from the Trenches Real Time Tracing of Azure Roles from Your Desktop 85 Custom Trace Listener 85 Send Message Console Application 86 Trace Service 86 Service Host Class 87 Service 88 Summary 88 MEET THE AUTHORS 90 Eric Nelson 90 Marcus Tillett 90 Richard Prodger 91 Saksham Gautam 91 Steve Towler 92 Rob Blackwell 92 Juliën Hanssens 92 Simon Munro 93 Sarang Kulkarni 93 Steven Nagy 93 Grace Mollison 94 Jason Nappi 94 Josh Tucholski 95 David Gristwood 95 Neil Mackenzie 96 Mark Rendle 96 5
  • 6. The Windows Azure Platform: Articles from the Trenches INTRODUCTION FROM THE EDITOR Hello all, The Windows Azure Platform is changing the way we architect, implement, deploy and manage solutions. In early 2010 it went live and in the first six months we have already seen an impressively diverse range of solutions developed to take advantage of the services offered. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others be successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications. You are not expected to read it in order from start to finish. Instead I would encourage you to head straight to the chapters or the individual articles that look most relevant or interesting. The book was put together in May and early June 2010, which means that it pre-dates the 1.2 release of the Windows Azure SDK. The 1.2 release adds some great new features, especially for Visual Studio 2010 and .NET Framework 4.0, in areas such as debugging and IDE integration. Volume Two of this book will cover those new features (and more!). Once you have had a chance to look at the articles please give us your feedback at (It should take less than one minute). Thank you and happy reading. Eric Nelson Developer Evangelist, Microsoft UK Website: Email: Blog: Twitter: WOULD YOU LIKE TO BECOME AN AUTHOR FOR A FUTURE EDITION? Developers value the sharing of best practices, knowledge and experiences – knowledge and experiences such as your own. If you have insight into the Windows Azure Platform then you are a great candidate for becoming an author involved in the next volume of this book as the Windows Azure Platform continues to evolve and broaden. Please email me ( with your proposed article(s) and if possible a “sample of your work” such as a link to your blog.
  • 7. The Windows Azure Platform: Articles from the Trenches INTRODUCTION TO THE WINDOWS AZURE PLATFORM The Windows Azure Platform contains three technologies which can be used individually or together to build solutions which run “in the cloud”. For the first time you are able to run your code and store your data in Microsoft datacenters and let Microsoft take on some of the responsibility for keeping your solution running great and able to respond to the changing demands of business. Solutions can either run entirely on the Windows Azure Platform or as a hybrid, with some of the solution running on-premise or elsewhere on the Internet. The three key technologies are Windows Azure, SQL Azure and Windows Azure Platform AppFabric: Windows Azure • Windows Azure is the cloud services operating system for the Windows Azure Platform. Windows Azure provides developers with on-demand compute and storage to run your code and store your data. • Windows Azure supports a consistent development experience through its integration with Visual Studio 2008 and Visual Studio 2010. Windows Azure is an open platform that supports both Microsoft and non-Microsoft languages and technologies. Windows Azure welcomes third-party tools and technologies such as Eclipse, Ruby, PHP, and Python. SQL Azure • Microsoft SQL Azure delivers the capabilities of Microsoft SQL Server to Windows Azure applications or applications running outside of the Windows Azure Platform. It can store and retrieve structured, semi-structured, and unstructured data with the advantage of high availability through the storage of multiple copies of your data. It enables relational queries, search, and data synchronization with mobile users, remote offices and business partners. Windows Azure Platform AppFabric • AppFabric provides secure connectivity as a service to help developers bridge cloud, on-premise, and hosted deployments. AppFabric comprises Service Bus and Access Control.
From simple eventing scenarios to complex protocol tunneling, AppFabric Service Bus gives developers the flexibility to choose how their applications communicate, addressing the challenges presented by firewalls, NATs, dynamic IP, and disparate identity systems. AppFabric Access Control enables simple, secure authorization for RESTful web services that federate with a variety of identity providers. There are many articles, videos and screencasts designed to help you get up to speed with the Windows Azure Platform and a great place to start is We also have a Getting Started chapter within this book.
  • 8. The Windows Azure Platform: Articles from the Trenches AE – ACRONYMS EXPLAINED If you are new to the Windows Azure Platform then you may need a little help with some of the acronyms and industry terms used in this book. • REST and RESTful – Representational State Transfer. A style of software architecture to enable clients and servers to interact. • WCF – Windows Communication Foundation. A technology shipped initially in .NET Framework 3.0 to allow communication to take place between code running in different “locations”. • Cloud Computing – running of code and storage of data off-premise. (Also see the 100+ alternative definitions of Cloud Computing, e.g. ) • Elastic Computing – as more processing power is needed or as more data needs to be stored, elastic computing (in our case the Windows Azure Platform) promises to rapidly respond to those demands and provision additional compute and storage resources. • PaaS – Platform as a Service is one approach to Cloud Computing that favors abstraction and simplicity over flexibility, e.g. the Windows Azure Platform. • IaaS – Infrastructure as a Service is one approach to Cloud Computing that favors flexibility over abstraction and simplicity, e.g. Amazon Web Services. • Codename “Dallas” – a 4th member of the Windows Azure Platform, currently in CTP. • CTP – Community Technology Preview. In simple terms – not quite as solid as a traditional Beta.
  • 9. The Windows Azure Platform: Articles from the Trenches CHAPTER 1: GETTING STARTED 5 STEPS TO GETTING STARTED WITH WINDOWS AZURE By Jason Nappi Getting started with a new technology can be daunting, but generally once you get going things become familiar and learning accelerates. Therefore, I’d like to focus on providing a few of the basic steps that I recently went through in the hope that it will both answer some of the basic questions and knock down some of the barriers to accelerated learning. The following are some of the primary design considerations for what I think of as a typical business application, and the implications of building those same types of applications in the Azure cloud. STEP 1: CREATING AN AZURE ACCOUNT. The first step, as you might imagine, is to set up an Azure account. Since Windows Azure is a cloud service, you’ll need to create an account in the cloud, and provision a cloud environment. You can create an Azure account at the Windows Azure Developer portal. This is a pretty straightforward registration process that will require you to create a Windows Live ID if you don’t already have one and will require a credit card. At the conclusion of the registration process you should have access to Windows Azure, SQL Azure and AppFabric. At this point you haven’t created any cloud services; you’ve only created an account under which the services you create can be provisioned and deployed. STEP 2: PROVISIONING A SQL AZURE DATABASE This step may not be required by everyone, but most of the applications I’ve built have been database driven. Given that, whether creating a new application or moving an existing one to the cloud, I think it’s going to be a fairly common question to ask where the database lives and how you connect to it. The reasonable answer is that if my application is going to be hosted in the cloud, my database needs to be in the cloud too. 
The Windows Azure Platform provides Windows Azure Storage as well as SQL Azure for storing data. SQL Azure is most similar to the relational databases of the typical business application, so while Azure Storage may have scalability and cost advantages, SQL Azure provides the more familiar paradigm. Naturally I’m inclined towards SQL Azure to get started. In order to create my cloud database I’ll need to return to the Azure account that I set up in step 1 and navigate to the SQL Azure section of the portal. To create a SQL Azure server, you’ll need to provide a username and password and the SQL Azure Developer Portal will create a server using a generated unique name similar to With the SQL Azure server created, you can now create the database. There is also an additional requirement that you configure firewall rules to allow access. Again, for the sake of simplicity, you can just grant your local machine’s IP address access to the SQL Azure server.
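The firewall requirement above can be illustrated with a minimal sketch. It assumes that each rule is an inclusive range with a start and an end address, and that granting a single machine access means a rule whose start and end are the same. The function name `ip_allowed` and the addresses are hypothetical and are not part of any SQL Azure API.

```python
import ipaddress


def ip_allowed(client_ip, rules):
    """Return True if client_ip falls inside any inclusive (start, end) range."""
    ip = ipaddress.ip_address(client_ip)
    return any(
        ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)
        for start, end in rules
    )


# Hypothetical rule set: one rule that admits only a single developer machine.
rules = [("203.0.113.10", "203.0.113.10")]

developer_ok = ip_allowed("203.0.113.10", rules)   # address covered by the rule
stranger_ok = ip_allowed("203.0.113.11", rules)    # address outside every rule
```

A connection attempt from an address outside every range is refused before any login is checked, which is why a fresh server appears unreachable until a rule is added.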
  • 10. The Windows Azure Platform: Articles from the Trenches Lastly, you might be wondering, as I did, whether the newly created SQL Azure database is accessible via the familiar SQL Server Management Studio Tools. I was able to successfully connect after downloading SQL Server Management Studio 2008 R2. STEP 3: BUILDING A WEB APPLICATION FOR AZURE Having provisioned our cloud database and proven that you can connect to it with familiar SQL Server Management Studio tools, and assuming you’ve created the tables required by your application, you’re ready to begin building your application. In order to do so you’ll need to install the Windows Azure SDK and the Windows Azure Tools for Microsoft Visual Studio 1.1. The good news about both of these is that they support Visual Studio 2008 and Visual Studio 2010. Once you fire up Visual Studio you’ll notice a new project template for “Windows Azure Cloud Service”. After choosing the cloud service template you will be prompted to choose from one of the cloud service ‘roles’: Web, Worker and WCF Service Roles. Assuming you’ve chosen “ASP.NET Web Role”, a solution containing two projects, a cloud services project and the familiar ASP.NET Web project, will be created.
  • 11. The Windows Azure Platform: Articles from the Trenches The only real difference between a standard ASP.NET web project and the ASP.NET Web Role project is the existence of a WebRole.cs file. The WebRole.cs serves as the entry point for Azure. When you hit F5 your Azure application starts up and runs inside the Development Fabric. The Development Fabric simulates the Windows Azure cloud environment enabling you to run, test and debug Azure applications on the desktop! STEP 4: PACKAGING THE WEB APPLICATION FOR WINDOWS AZURE Packaging up the application for publishing to Azure turns out to be fairly simple. From within Visual Studio you can right click on the Cloud Services project and choose Publish from the context menu. This will package the web application into a .cspkg file, and also create the ServiceConfiguration.cscfg file. These two files are all you need to deploy your application to Windows Azure. STEP 5: DEPLOYING THE WEB APPLICATION TO AZURE. Now that you’ve packaged your ASP.NET Web Role, you’ll need to return to the Windows Azure account you created in Step 1 and create your Windows Azure service. Under the Windows Azure tab choose “new service” then “Hosted Service” and provide a name and description for your new cloud service. Once the Service is created there’ll be two hosted service locations, staging and production. Under each will be a ‘Deploy’ button. Choose Deploy under Staging. This will bring up a screen asking for the two files created in Step 4. Provide both files, and deploy. After deploying the package and the configuration you’ll be provided with a unique url for accessing your application. Now you’ll also see that you have the ability to ‘Run’ the service.
  • 12. The Windows Azure Platform: Articles from the Trenches The application won’t be accessible via the url until you Run it, so press Run, and wait for it, wait for it, wait for it…it takes a while to provision the Windows Azure infrastructure for your application, but once you get the green light you should be good to go. 12
  • 13. The Windows Azure Platform: Articles from the Trenches These are just a few of the baby steps I’ve taken to become familiar with Windows Azure. With these steps I’ve been able to demonstrate that developing for Windows Azure is largely the same development experience that I’m accustomed to. However, one of the more intriguing considerations when building for Windows Azure is the potential use of Windows Azure Storage as a data store instead of the more conventional relational database provided by SQL Azure.
  • 14. The Windows Azure Platform: Articles from the Trenches THE BEST TOOLS FOR WORKING WITH THE WINDOWS AZURE PLATFORM By Sarang Kulkarni “A platform is known by the tooling available around it!” Much clichéd but still holds true. Windows Azure, though a fairly nascent cloud platform is aptly supported by some fantastic tooling which make development fun and a developer’s life easy. Let us get the usual suspects out of the way first to make way for some more interesting kids on the block, many of which I cannot do without. CATEGORY: THE USUAL SUSPECTS Microsoft Visual Studio 2010® Visual Studio 2010 (VS2010) is a stable development platform for Windows Azure. Though there are very few changes specific to Azure when compared with VS2008, the overall development experience is definitely superior. Windows Azure VMs support .Net Framework 4.0 from OS Version 1.2 and therefore it makes sense to use VS2010 to take advantage of the new features of .Net 4.0 in the cloud. As always, the Express edition is free. Microsoft SQL Server Management Studio® 2008 R2 The R2 release is recommended for working with SQL Azure. The biggest advantage being the comfort of an SQL IDE we have grown up with. I don’t think I need to wax poetic about this one, this is Bread and Butter. Again Express edition is free and recommended as it serves most of the needs. Download it from: E81B33D8026B&displaylang=en. User Accounts and Local Security Policy Control Panel applets I know there’s nothing specific to Azure here. But it comes very handy to have a user with permissions as laid out at to avoid any surprises related to user rights while running in the fabric. CATEGORY: WINDOWS AZURE STORAGE What: Cerebrata - Cloud Storage Studio Why: Cerebrata Cloud Storage Studio (CSS) is a WPF based client for managing Azure Storage, as well as hosted applications. 
CSS started as a commendable effort by a small firm to provide intuitive visual access to Azure Storage, putting the Storage APIs to good use. It now stands as a one stop solution to manage everything under Azure Storage, as well as a lot of things in the hosted applications.
  • 15. The Windows Azure Platform: Articles from the Trenches Figure 1: Cloud Storage Studio - Connect to Azure Account You can design a table schema in CSS, perform CRUD operations on existing tables, download/upload table contents to/from the disk and filter table contents. Basic querying support is also provided which supports the WCF Data Services (formerly ADO.NET Data Services) query syntax. LINQ query support would have been a welcome add-on. Blob storage is a forte of CSS and all possible operations on Blobs and Containers are available. You can create containers, configure access policies, list blobs in a container replete with the folder structure, upload/download page/block blobs, rename, copy and move blobs, create and view blob snapshots (very useful), and create a signed URL for a blob. MIME type configuration support is icing on the already nice cake. My only grudge is the very basic breadcrumb while navigating the container structure. CSS also features a simple yet effective service management UI. The design closely resembles that of the actual Azure developer portal. The same features are offered plus a few more. The regular service management operations like connecting to hosted services, viewing, deploying and deleting services, swapping deployment slots, managing API certificates and managing affinity groups are available. A very useful feature we find here is a nifty little checkbox at the bottom of the create service deployment dialog which reads “Automatically run the deployment after creation” – a nice touch.
  • 16. The Windows Azure Platform: Articles from the Trenches Figure 2: Cloud Storage Studio - Deploy a Service It costs a totally worthwhile $60 per license. Notable alternatives are • Cerebrata’s own CSS/e, which is a Silverlight application providing very basic but useful Storage Service administration • the open source Azure Storage Explorer • finally, the far from perfect yet still useful open source alternative, the Azure MMC Snap-in. Azure MMC is in its second version, covers almost all the bases that Cloud Storage Studio does and deserves a mention. Figure 3: Windows Azure MMC
  • 17. The Windows Azure Platform: Articles from the Trenches What: LINQPad Why: It would not be an overstatement to term LINQPad by Joseph Albahari the best querying scratchpad available for LINQ. LINQPad can query a varied set of data sources. Of particular interest to this discussion are SQL Azure, WCF Data Services (think codename “Dallas”) and Windows Azure Table Storage. Yes, Table storage! LINQPad steps in where Cloud Storage Studio stops being adequate - the querying capabilities are superior and the interface more powerful. Figure 4: LINQPad - Sample Query on the WADPerformanceCounters table As usual some of the best tools come free and LINQPad surely fits the definition. There is also a pro version available with some bells and whistles like auto-complete, Visual Studio integration etc. CATEGORY: WINDOWS AZURE DIAGNOSTICS What: Cerebrata – Azure Diagnostics Manager Why: Azure diagnostics has taken some time to reach the final form we see it in today. There are few tools which provide the comfort of an Event Viewer or a comprehensive management dashboard for working with the diagnostic data. Azure Diagnostics Manager (in public beta at the time of writing) attempts to achieve just that. The feature set is fairly comprehensive, covering the following: • You can either connect to an Azure storage account to read the diagnostics information and find the deployments from there and connect to the listed deployments or choose to connect directly to a subscription and get a list of hosted services to monitor. • The Dashboard provides a bird’s eye view of all the diagnostic information collected. One may choose to view Event Viewer, Trace Logs, Infrastructure Logs, Performance Counters, IIS Logs, IIS Failed Request Logs, Crash Dumps and On Demand Transfer.
  • 18. The Windows Azure Platform: Articles from the Trenches • If you have only deployed a service and are collecting none of these, fret not. Azure Diagnostics Manager also provides access to the diagnostic monitor inside your Roles as well as individual role instances through the Remote Diagnostics API. With this you can enable/disable any of the diagnostic information being collected or you can alter the verbosity/frequency. Figure 5: Azure Diagnostics Manager - Performance Counter Graphs CATEGORY: SQL AZURE What: SQL Azure Migration Wizard Why: As most of us working with cloud solutions might have already noticed, the largest chunk of the work coming to the System Integrators is the migration of existing applications to the cloud. One of the key aspects of this is database migration. The SQL Azure Migration Wizard helps simplify database migration. With it we can analyze scripts for SQL Azure compliance, generate scripts and migrate databases – schema and data. Migration is supported from SQL Server to SQL Azure, SQL Azure to SQL Server and SQL Azure to SQL Azure. Even in its 3.2.2 version it still has its share of quirks but is vastly improved and great for the mundane tasks in DB migration.
  • 19. The Windows Azure Platform: Articles from the Trenches CATEGORY: GENERAL DEVELOPMENT What: Fiddler Why: Fiddler is a Web Debugging proxy. It allows us to inspect all incoming and outgoing HTTP(S) traffic on a machine. This is particularly helpful while working with the Azure Storage, Azure Service Management API, Remote Diagnostics Manager API and anything REST. Looking at the HTTP traffic gives an insight into how the Requests/Responses are constructed, what Responses are received and a host of other information that every web service developer/consumer will find handy. Figure 6: Fiddler – Statistics Fiddler scripting engine can be used to filter in/out requests and/or responses and also issue preconfigured responses. Fiddler can also target specific processes to filter traffic only from those processes. Fiddler provides an API which can be used in a .Net application to programmatically track network traffic and use almost all of Fiddler’s features. This has enabled some nifty Fiddler Extensions like Watcher - A Passive Security Audit tool , Chad Oswald’s Request to Code which gives the required code to issue captured http requests and the JSON Viewer which visualizes JSON objects. 19
  • 20. The Windows Azure Platform: Articles from the Trenches CHAPTER 2: WINDOWS AZURE PLATFORM ARCHITECTING FOR AZURE – BUILDING HIGHLY SCALABLE APPLICATIONS By Steven Nagy Two key reasons organisations move to the cloud are to reduce cost and leverage economies of scale. Unfortunately not every type of application is suited to the cloud, and more often than not, those that are suited for the cloud are not architected for scalability. Further, the Windows Azure Platform has a pricing model that, if not considered during your architecture phase, can negate the cost benefits of moving to the cloud to begin with. This article will address the key things to consider when architecting highly scalable applications that are cost-optimised for the Azure platform. PRINCIPLES OF AZURE ARCHITECTURES The Windows Azure Platform already provides elasticity, redundancy, and abstractions from the distributed platform on which it is run. This gives us a flying head start when designing systems for the cloud, but there are still key measures we need to take to ensure our application doesn’t become its own worst enemy. Here we define five key tenets to keep in mind throughout the design and implementation phases of your project. PARTITION DATA Data partitioning is not a new concept. Traditionally it has helped us break up massive databases into smaller more manageable pieces, and to improve query performance by splitting unrelated data into different partitions. In scalable applications it is important for those same reasons, but also allows us to scale more effectively; imagine serving 500 requests per minute on a single database versus 50 requests per minute across 10 databases. Furthermore, storage is cheap. Consider SQL Azure pricing versus Azure Table Storage for 1 GB of storage: $10 and $0.15 per month respectively. Both are at least 3 times redundant.
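To make the "500 requests on one database versus 50 each on 10 databases" picture concrete, here is a small sketch of key-based partitioning. It is an illustration only: the function names, the customer IDs and the choice of a SHA-256 hash are assumptions made for the example, not something the platform prescribes.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (the same on every instance)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def spread(keys, shard_count):
    """Count how many requests each shard would receive."""
    return Counter(shard_for(k, shard_count) for k in keys)


# Hypothetical minute of traffic: 500 requests, each for a different customer.
requests = [f"customer-{i}" for i in range(500)]
per_shard = spread(requests, shard_count=10)
```

Because the hash is stable, every compute instance routes a given customer to the same database without any coordination, and the load on each database falls to roughly a tenth of the total.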
However not only is Azure Table Storage cheaper, it has inbuilt partitioning mechanisms that allow you to allocate every single entity (row) of data to a horizontal partition (or shard) based on the partition key you provide. In Table Storage, each partition is a physically different storage node, which means queries and requests can scale extremely efficiently. If you don’t have complex relational queries, this is the ideal choice. Denormalising your data can help immensely by removing those relationships and allowing ease of partitioning. This is essentially the premise of the ‘NoSql’ movement. You should also consider data duplication for further performance increases. Consider a search function for customers by age demographic or by city; by having two copies of the data in different partitions, your query and retrieval time is highly efficient.
  • 21. The Windows Azure Platform: Articles from the Trenches The flip side to this approach is the added complexity of managing multiple copies of data. Partitioning support in Azure can be summarised as follows: • Table entities are horizontally partitioned on partition key • Blobs are partitioned based on their container • Queues are partitioned on a per-queue basis • SQL Azure supports no partitioning Vertical partitioning is not supported by default; however, it makes sense to store smaller amounts of data together when the additional fields are not needed on the majority of requests. COLOCATION SQL Azure, Azure Storage, Azure Compute roles, and the AppFabric all have bandwidth costs for data moving in and out of the data centre. It makes sense to keep this in mind when building our applications. Azure already lets us choose our data centres and more importantly, we can co-locate components of our system via Affinity Groups such that network traversal is minimal and faster. Luckily this is a deployment consideration and not so important with up-front design. CACHE A more important consideration is the various opportunities to utilise caching mechanisms. There are many ways that cache can be harnessed to minimise transactions: for end user http requests, for underlying data stores, or for memoization purposes. When almost everything in the platform is accessible via a REST interface, it pays to invest effort into caching.
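As a rough illustration of the memoization case mentioned above, the sketch below wraps a lookup in Python's standard `functools.lru_cache`. The `customer_city` function is a hypothetical stand-in for a billable storage or database call, and the counter exists only to show how many real calls are made.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive lookup really runs


@lru_cache(maxsize=1024)
def customer_city(customer_id: str) -> str:
    """Hypothetical stand-in for a call that costs a storage transaction."""
    global call_count
    call_count += 1
    return f"city-for-{customer_id}"


customer_city("42")   # real lookup
customer_city("42")   # answered from the in-process cache
customer_city("43")   # real lookup for a new key
```

Three requests result in two real lookups. An in-process cache like this lives inside a single instance, so each instance warms its own copy, and entries need an expiry strategy if the underlying data can change.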
Some cache concepts to consider are: • Client side timed cache – content that expires after a certain amount of time, preventing client browsers from requesting a page, serving a local copy instead • Entity Tags (ETags) – allow you to specify a ‘version’ in a http header field; the server can indicate the version has not changed, in which case no other data is exchanged, otherwise it can return all the data for that request • ASP.Net Page level Cache • Distributed Cache – has multiple nodes that either all share the same content (shared everything) or have unique sections of the cache (shared nothing); shared everything distributed caches work well in Azure because of the throwaway nature of commodity hardware and ease of scale STATE
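Returning to the Entity Tags item in the cache list above, the exchange can be sketched as follows. This is a simplified model under stated assumptions: the tag is derived from a hash of the body, and the client sends back exactly one tag in `If-None-Match`. Real HTTP also allows lists of tags and weak validators, which this sketch ignores. The function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # Version unchanged: 304 Not Modified, no payload is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


first = respond(b"hello")                                 # client has nothing cached
repeat = respond(b"hello", if_none_match=first[1]["ETag"])  # client revalidates
changed = respond(b"hello, world", if_none_match=first[1]["ETag"])
```

The repeat request still counts as a transaction, but it carries no body, which saves bandwidth on large responses.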
  • 22. State has often been cast as the enemy of concurrent programming, and the same applies at higher levels of abstraction, such as multiple compute instances. Mutable state requires locking and tracking in concurrent environments, which adds overhead and complexity to applications. Reducing, or even removing, state is therefore a goal worth pursuing.

Sometimes state is specific to a single user, such as session state. Load balancers in the Azure data centres are round-robin, so as soon as you have more than one web front end you can no longer store session state in process (the default); if session state is critical to your application, look to move it to SQL Azure or Table Storage instead. However, session state is typically abused and is generally not actually required for the situations in which it is used. As an alternative to sessions, consider claims-based security, such that any page request is accompanied by a set of claims. The AppFabric Access Control Service can assist with this.

DISTRIBUTE WORKLOADS EFFECTIVELY

Typically, when multiple sources need to access a resource there is a level of contention. Locks and leases need to be taken, and other threads are blocked until the contention is resolved. As with state, this problem exists in all forms of concurrent programming, and it is just as important in multi-instance work sharing scenarios. Worker roles need to pick up items for processing, but when there are multiple instances of the same worker role, how do we ensure that each instance does not pick up the same work item? The 'Asynchronous Work Queue Pattern' is one such solution. By providing a robust, redundant queuing mechanism that guarantees unique distribution of work items, the workers are ignorant of leases and locks and can focus on the job of processing work items.
Such a queue will be reusable for many different work types, and the Windows Azure Storage Queue service is an ideal candidate. There are other messaging architectures that allow us to decouple our components. AppFabric allows a 'NetEventRelayBinding' for Publish/Subscribe scenarios, for example.

MAXIMISE RESOURCES

One could argue that if your CPU is not at 100% it is being underutilised. In Azure you pay for the core regardless of usage, so it makes sense to get the most bang for your buck. When using worker roles, multi-threaded architectures are often forgotten. Since adding another instance means an additional hourly cost, first ensure you are getting the most out of your current instances. If your worker (or web role, for that matter) has lots of IO work, it makes sense to use multiple threads.

Auto-scaling resources is also worth investigating. Typically an IT department will maintain enough servers to cope with its peak periods; consider instead starting at trough capacity and using auto-scaling functionality to add instances dynamically. When load starts to taper off, start scaling down, cutting costs as you do.
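The work queue and multi-threading points above can be sketched together. The Python below is only an in-process illustration: the standard library `queue.Queue` stands in for the Windows Azure Storage Queue service, and the function name `run_workers` and the squaring "work" are hypothetical. It does not model the storage queue's behaviour of hiding a message for a visibility timeout until the worker deletes it, so treat it as a picture of competing consumers rather than of the service itself.

```python
import queue
import threading


def run_workers(items, worker_count, handle):
    """Process every item exactly once using worker_count competing threads."""
    work = queue.Queue()
    for item in items:
        work.put(item)

    results = []
    results_lock = threading.Lock()

    def worker(name):
        while True:
            try:
                item = work.get_nowait()  # the queue hands each item to one worker only
            except queue.Empty:
                return                    # nothing left to do
            outcome = handle(item)        # IO-bound work would go here
            with results_lock:
                results.append((name, item, outcome))

    threads = [threading.Thread(target=worker, args=(f"worker-{i}",))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


processed = run_workers(range(20), worker_count=4, handle=lambda n: n * n)
```

The workers never coordinate with each other directly; all sharing goes through the queue, which is the property the pattern relies on. Several threads inside one instance like this are a way to keep a paid-for core busy before adding another instance.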
  • 23. Currently you can utilise the content delivery network (CDN) to push blobs out to localised edges. This will help improve latency for your customers. Also consider what could qualify for blob storage; essentially anything static is a contender:
- PDF and Word documents
- Videos
- Website images
- Website CSS and JavaScript libraries
- Any static HTML website pages
- Silverlight files (XAP)

Blob storage currently allows blobs to be stored in the root container. This feature was specifically included so that Silverlight applications running from blob storage could place a cross domain policy file at the root of the URL namespace (a requirement for cross domain policy files).

SUMMARY

While not exhaustive, this article gave you a brief overview of some key principles to keep in mind when architecting applications to run on the Windows Azure Platform. By following these guidelines you should be able to achieve the core objectives of scalability and cost recovery in the cloud.
  • 24. THE WINDOWS AZURE PLATFORM AND COST-ORIENTED ARCHITECTURE

By Marcus Tillett

Summary
- Cost is much more of an architectural consideration for Azure than for a traditional on premise or hosted solution.
- Cost implications of the chosen architecture can be significant.
- Costs should be considered in the context of the end to end development and application lifecycle management processes.
- Model costs for the chosen architecture but, most importantly, test the model.

COST IS IMPORTANT

Cost-oriented development is nothing new. A low cost approach to building an application or product is desirable, but the methodology used to achieve this is not always very sophisticated. When considering a cloud platform such as Azure, the cost implications of the chosen architecture can be significant and require a more sophisticated approach. While a traditional on premise or hosted architecture may not consider cost as a significant factor, cost is an area that receives significantly more focus for Azure. There is a range of costs to be considered; these costs need to be considered in the context of Azure and of the end to end development and application lifecycle management processes.

WHAT COSTS TO CONSIDER

The development process can be a significant cost consideration for Azure. There is a continuum of development strategies for Azure: from, at one extreme, using the Azure environment for development to, at the other extreme, developing without any reference to Azure. There are cost implications and significant other pros and cons across this continuum. As an example, consider the use of software factories. With a software factory that uses a strict assembly process, the cost of using the production platform may be prohibitive due to the expense of both the platform and the training required. These concerns would drive a cost-oriented architecture where all Azure specific components are abstracted from the developer or potentially replaced with non-Azure components. While this may be an extreme example, it does highlight one of several areas to be considered.

Another significant topic is the methodology applied to the migration of an existing application, or the consideration for setting up data required by a new application. Migration and set up need to include both the application and the data. The time, processes and procedures needed to transfer large volumes of data or complex data, in particular, may be a significant undertaking. With the potential complexity of managing changes to a live data source, the total business cost of the chosen approach can be a critical factor.

The cost implication of the platform itself is, perhaps, the obvious area to necessitate a cost-oriented architecture. It is natural to be drawn to, for instance, the dramatic price difference for data storage between SQL Azure and Windows Azure storage. While this may be critical to some applications, it is better on balance to construct a solid architecture first, as this provides a better long term approach than focusing on cost initially. This should be supported by modelling the costs of all the components
  • 25. of the application. However, it is even more important to test this model for the most cost critical aspects of the application. This provides an understanding of how the application design and the charging mechanisms of Azure impact the cost model. With this information the architecture can be reviewed for significant cost savings. For any aspects that are cost critical, monitoring should be included in the final application and used to tune the system, while ensuring that the evolution of Azure and of the application is analysed for significant cost implications. Indeed, monitoring the whole system as a means to verify costs and SLA is another architectural consideration.

As a way to augment the full cost modelling process, there are some scenarios where the cost of the platform suggests a cost-oriented architecture. One of these is multi-tenanting of an application where there are high tenant numbers. A basic on premise or hosted server model with a pair of servers can enable the creation of a separate IIS web site and SQL Server database for each tenant. This model supports tens or perhaps hundreds of tenants for nearly the same cost as a single tenant. Translated to Azure, the same architecture might consist of a Windows Azure Web Role and a 1GB SQL Azure database for each tenant. This would equate to a monthly cost of approximately US $100 per tenant, and the cost of this Azure architecture scales linearly with tenant numbers. This is not to state that Azure is not suitable for multi-tenanted applications, but that where cost is a critical factor for the application a different architectural approach may be required.

CONCLUSION

Whether the considerations described here could be termed cost-driven [8] or cost-oriented architecture [9, 10], the terminology is less important than the realisation that cost is much more of an architectural consideration for Azure than for a traditional on premise or hosted solution.
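The multi-tenant comparison in this article can be turned into a small model that is easy to test against real usage. The prices below are illustrative assumptions chosen to land near the US $100 per tenant per month mentioned in the text (a compute rate of $0.12 per instance hour and $9.99 per month for a 1GB database), and the cost of the shared server pair is entirely hypothetical. Check current pricing before relying on any of these numbers.

```python
HOURS_PER_MONTH = 24 * 30

# Assumed, illustrative prices in US dollars.
COMPUTE_PER_HOUR = 0.12           # one small web role instance
SQL_1GB_PER_MONTH = 9.99          # one 1GB SQL Azure database
SHARED_SERVERS_PER_MONTH = 400.0  # hypothetical cost of a hosted server pair


def azure_per_tenant_cost(tenants):
    """One web role and one 1GB database per tenant: cost grows linearly."""
    per_tenant = COMPUTE_PER_HOUR * HOURS_PER_MONTH + SQL_1GB_PER_MONTH
    return tenants * per_tenant


def shared_server_cost(tenants):
    """A server pair hosting a site and database per tenant: roughly flat."""
    return SHARED_SERVERS_PER_MONTH if tenants > 0 else 0.0


for tenants in (1, 10, 100):
    print(tenants,
          round(azure_per_tenant_cost(tenants), 2),
          round(shared_server_cost(tenants), 2))
```

Even a model this crude makes the crossover visible: under these assumptions the per-tenant design is cheaper for a handful of tenants and far more expensive for a hundred, which is the point at which a shared-role, shared-database design deserves a look.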
[8] Lessons Learned: Building Multi-Tenant Applications with the Windows Azure Platform
[9] Thinking of... Delivering Solutions on the Windows Azure Platform? Delivering-Solutions-Platform-Questions/dp/0956155634/
[10] Windows Azure Platform for Enterprises
  • 26. DE-RISKING YOUR FIRST WINDOWS AZURE PROJECT

By Simon Munro

Developer enthusiasm for building solutions based on Azure is not always shared by business. While it is great (and perhaps obvious to us) that the cloud is 'the way of the future', some individuals, organizations and vendors are ready for the change while others are not. Not all vendors have technologies for the cloud, and many businesses, products, industries and jobs will go as the cloud wave washes them out to sea. Vendors are scrambling for attention and pushing their biased, marketing-oriented opinions through the biggest dinosaurs of all, the print media, which culturally could not even cope with the changes brought on by the Internet. Most anti-cloud and vendor-bashing opinion plays on fear and its business cousin, risk, where the urge is to maintain the status quo in our (currently) risk-averse environment.

It is unsurprising, then, that the people we need to make decisions about cloud computing in our own organizations are confused, wary and reluctant to make a commitment to our latest idea of running our solution on Azure. The term 'the cloud' has become synonymous with 'the web' and is indistinct from the 'cloud computing' platforms that we are interested in. The unfortunate side effect is that the behaviour of Google, Facebook, Apple and other consumer-facing web properties that willy-nilly change terms of service and sell personal data for profit casts a shadow over business oriented cloud computing services.

While the dust may settle at some point in future, if we want to build a solution on Azure any time soon, we will have to take responsibility for helping business understand the issues in order to gain their support.
While we may prefer to deal only with technical issues, the current reality is that in most environments we have to proactively discuss the perceived risks and demonstrate that we, as well as Microsoft and the Windows Azure platform, are actively managing and reducing risk.

POPULAR RISKS

Risks to data are by far the most publicised, because once data is in databases that are outside of an organization's locked-down data centre a degree of control and authority over the data is lost. Unlike students that go and live in co-ed dorms, data does not get drunk and put pictures of itself up on Facebook when it leaves home, but the suspicion still remains that off premise data is a high risk. While the risk to data may increase, the actual risk, in most cases, is greatly exaggerated and manageable.

Process related risks are also well known, centred on the involvement of other parties in the operational aspects of the solution. No longer can business dictate service levels, or even have the confidence in an external supplier of services that they may have had with their own internal IT. As with data, there are real issues here that have fairly complex contractual ramifications as customers attempt to reduce vendor lock-in, guarantee service levels and maintain operational, security, performance and other standards.

COVERT RISKS

While mainstream CIO information sources popularise some risks by extensive coverage, there are many risks that are just as real but less well known, often due to their more technical nature.
  • 27. The most obvious is the lack of skills and experience in creating secure, reliable and performant cloud computing solutions. This also relates to the problem of development engineering costs, which could be higher than simply throwing hardware at performance bottlenecks.

Even Microsoft, as our trusted provider of platforms and tools, still has risks embedded within Azure. The lack of on-premise alternatives to cloud technologies such as Azure tables and queues makes the commitment to the platform quite high (a kind of vendor lock-in), and the tooling is still immature and unable to easily support accepted engineering practices such as continuous integration (see 'Using a CI build to achieve an automated deployment of your latest build' by Grace Mollison).

NON-TECHNICAL TACTICS FOR REDUCING RISK

While ultimate responsibility for managing risk falls to project managers and other people within the organization, the identification of risk still remains the responsibility of everybody on the team. By downloading this book you have more knowledge of cloud computing than many of your co-workers, so before getting into the technical aspects you will need to shoulder additional responsibility and deal with some aspects of reducing risk that do not involve code.

CHOOSE THE CORRECT APPLICATION

Choose something simple that is better suited to cloud computing, such as an application that is public facing and may have demand peaks. Build on those successes before tackling applications that contain sensitive data, integrate with a lot of other systems, are a migration of an existing legacy system or contain a lot of traditional database storage and reporting.

ENGAGE EARLY

Even if your project is a low profile skunk works development, you need to engage with legal, compliance, operations, finance, audit and other parts of the business sooner than usual.
Normally we would not worry about throwing up a new website onto our existing data centre, but if you surprise people with a rogue cloud computing application it may get shot down.

UNDERSTAND THE PRICING AND OPERATIONAL MODEL

As much as it may look simple on the surface, digging deeper into the pricing, billing, SLAs and related aspects of cloud computing platforms can become complicated, with broad reaching impacts on legal positions, compliance and interdepartmental feuds. You have to at least put the Azure prices in a spreadsheet with your estimated requirements and put an annotated printout of the SLA in your project sponsor's hands.

UNDERSTAND THE IMPLICATIONS

While it may be unnecessary to do a full threat model, you need to understand the possible financial, reputational and other risks if your application is compromised or the data gets lost.
  • 28. Understanding the effects of loss should influence your approach to what data is stored on the cloud, for how long, and whether it is moved to on-premise storage.

FAMILIARISE YOURSELF WITH ON-PREMISE RISKS

Because cloud computing is seen to have security risks, the focus on security often means that the solution is more secure than its on-premise counterparts. Whenever defending the risks of cloud computing, make sure that you compare them to the everyday risks of the existing on-premise platform. Not all solutions, networks and other infrastructure can actually deliver the availability and security that they promise.

UNDERSTAND THE APPETITE FOR RISK

Culturally, startups can absorb cloud computing risks as part of their overall risk exposure, compared with more risk averse organizations such as banks that are, at least this year, less likely to absorb additional risk. More mature organizations have processes and committees for managing risks and, although it may ultimately be the project sponsor's responsibility, you need to get a feel for the ability of the organization to take on risk before you pitch your big idea.

TECHNICAL TACTICS FOR REDUCING RISK

HOW EXTREME?

Microsoft has made it quite simple to take a good ol' ASP.NET web application with an underlying SQL database and throw it up onto the Azure cloud with minimal changes. On the other hand, building a well architected solution that has been optimised for a cloud computing environment is more difficult, involved and risky. If your system is being built within a risk averse environment and does not need to be built for the cloud, forgo Azure storage, worker processes, federated identity management and other cloud specific technologies and build a simple solution with web roles and a SQL database.
Azure will support you well whichever approach you choose, but you need to figure out how much of the fancy new stuff you really need and make those decisions early.

DEFINE THE APPROACH TO DATA

When it comes to cloud computing risks, data is the most sensitive and active topic, and it needs to be addressed early on in the solution design. Fortunately SQL Azure addresses many of the concerns and risks around the NoSQL-like Azure tables by providing a familiar database platform if such familiarity is required, but ultimately Azure storage, caching and other technologies need to be considered in any good Azure architecture.

Whatever the bias for storage in the Azure cloud, there is still the issue that the data is in the cloud, and it needs to be dealt with in your architecture. There may be a requirement to move or copy data from Azure to an on premise database for reporting, integration with other systems or even just the feeling that the data is safer.

MANAGE THE ENGINEERING COST
  • 29. Unless you have built a reasonably sized application on Azure and deployed it in a live environment, there are going to be unforeseen technical challenges that will present themselves. By reading this book you are clearly on the right track and trying to learn from the experiences of others, but you need to do a lot more than just read or learn on the job. You need to install the tools, write code, deploy, put it under load, scale up, scale down, debug, diagnose and try out a lot of unfamiliar patterns and technologies just to reduce the impact of unforeseen quirks.

IMPLEMENT WITH GOOD ENGINEERING PRACTICES

The future of your first Azure application is fairly unsure. Cast your mind out two years and you cannot be sure that your architectural choices were correct, which technical components have been added or abandoned, whether regulations have changed or whether the attitudes of your organization towards cloud computing have altered. The concerns raised by the software craftsmanship movement (maintainability, testability, extensibility) are amplified in such an environment, which is years from settling down.

The Azure combination of a well established platform in the .NET ecosystem with some new technologies, approaches and thinking thrown in means that we have both the need and the frameworks to craft solutions properly to reduce the risk that we are exposed to. Testability, inversion of control, loose coupling and other software craftsmanship techniques are well supported, understood and debated on the .NET platform and are therefore (reasonably) portable onto Azure. You need to hone these skills, as single layered, monolithic architectures that seem easy at first and are encouraged by Microsoft marketing and tooling will result in an approach with high and unnecessary risk in an already risky space.
DEVELOPER RESPONSIBILITY

While technologists may be excited at the technical opportunities of cloud computing, business and other decision makers are probably more wary of the cloud than of any other (recent) computing technology shift. They are reading conflicting messages from vendor marketers and self proclaimed cloud experts, while their own staff are both protecting existing jobs and whispering discord in the passageways. So while risk management and selling of architectures may not be amongst the most exercised developer skills, cloud computing requires that we take cloud computing to the business and take some responsibility for allaying fears.
  • 30. TRIALS & TRIBULATIONS OF WORKING WITH AZURE WHEN THERE'S MORE THAN ONE OF YOU

By Grace Mollison

I had enormous fun working on an Azure project, See the Difference, that took 7 weeks from the start of development to handing over to the client. The technology stack used was: Windows Azure hosting, Windows Azure Storage, SQL Azure, ASP.NET MVC, N2CMS, Spark View Engine, Castle Windsor, xVal, PostSharp.

There was one bugbear, in that the Azure development experience is NOT designed for a team of developers, and I needed to get that sorted out. So where did I start? With a list, of course. Here were the big ticket items:
- The ability to set up three environments: Development, Testing and UAT. Testing and UAT to be accessible by all members of the team
- Shared access to the hosted environment
- Automated deployments to the cloud as part of a CI build. After all, no self-respecting development team is without a continuous integration build, is it?

DEVELOPMENT ENVIRONMENT

For the development environment we stuck to Visual Studio 2008 SP1. Visual Studio 2010 was in beta 2/RC when we undertook the development, but with all the potential unknowns with Azure that was a step too far. The Azure developer tools were installed on each developer workstation and the Azure SDK on the build server. There was an upgrade to the Azure SDK during the development cycle which the development team said was needed, which meant manually updating the various machines that constituted the environment (alas, no WSUS). Fortunately this only happened once during the development cycle. In addition to Visual Studio we also supplemented the development environments with a few extra tools that provided a more complete development experience.

TEST ENVIRONMENT

The Test environment proved to be more challenging. The most pragmatic way to sort it out was to provision another development workstation running the development fabric.
But (yes, I know there's always a "but") the Development fabric runs against the local loopback address. To get round this, an SSH tunnel had to be set up between the target machine and the client machines that needed to access it. Alas, this proved to be slightly less than user friendly, and the fact that the random allocation of ports for the local storage fabric had to be resolved after each new deployment made it basically unworkable. The differences between the Development fabric and the Azure fabric were also impacting the team deliverables, as we ended up seeing differences in behaviour or could only test certain functionality in the staging environment. We resorted to using Azure Staging as our Test environment.
  • 31. I was anticipating an easy ride from here on but... yes, it's another of those "buts".

CERTIFICATES

The team members needed either to use their own self-signed certificates or to use a certificate I generated, which is then uploaded onto Azure. As the team was small and fluid, the decision went with using one I generated. This turned out to be a good call, as we did have problems with certificate connections apparently timing out after some time for some team members for no obvious reason. Because there was only one certificate to worry about, it was relatively painless to resolve the problems around its use. It is bad practice to share certificates in this way, but pragmatism was the order of the day. For a larger team with a longer development cycle I would advocate each developer using a personal certificate, which can then be easily revoked.

One thing we quickly learnt was that in the early days of development, suspending, then deleting, was the safest approach to deploying a new package. The small team meant it was easy to communicate the change of URL this caused.

WHEN THINGS GO WRONG

It's a fairly nerve-racking experience when things go wrong, as often you can do nothing but wait for Azure to barf and throw a Dr Watson, and there's no real feedback when Azure tries to spin up the roles.

Alas, as soon as we got to UAT we had to give up our staging environment and minimise changes to the Staging URL, as both the client and a 3rd party needed to know the URL. The loss of this environment for system testing meant we were forced to press my personal Azure account into service as the Staging environment.

We did get the automated deployment in place, but it's a tale too long to describe in this article.

SUMMARY

The Windows Azure Platform may not be quite ready for team development out of the box, but once you understand what needs to be addressed, the barriers to team development are easily overcome.
With a small amount of work up front, you can treat development for the Windows Azure Platform as you would any other application developed using your familiar team development tools.
  • 32. USING A CONTINUOUS INTEGRATION BUILD TO ACHIEVE AN AUTOMATED DEPLOYMENT OF YOUR LATEST BUILD

By Grace Mollison

This article assumes familiarity with Team Foundation Build and MSBuild concepts such as tasks and properties.

Setting up a Continuous Integration (CI) build to automatically push a successfully built package directly to Azure cannot be achieved straight out of the box but requires some additional work. This article outlines an approach taken whilst delivering the See the Difference project using the Windows Azure Platform.

GETTING THE RIGHT "BITS"

The first thing that was done was to collate and configure the components that would be needed to allow the build server to access the target Windows Azure portal via a command line. Doing this requires the Azure Service Management API. Using the API requires an X.509 certificate. I created a self-signed one using the makecert tool, which is part of the Windows SDK. An example of how to do this is shown below:

    "c:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\makecert" -r -pe -a sha1 -n "CN=Windows Azure Authentication Certificate" -ss My -len 2048 -sp "Microsoft Enhanced RSA and AES Cryptographic Provider" -sy 24 MySelfSignedCert.cer

The blog post 'Creating and using Self Signed Certificates for use with Azure Service Management API' explains in detail how to configure the certificate on the target Azure portal and on the machine that needs to communicate with the portal.

I downloaded the Windows Azure Service Management PowerShell CmdLets and also the Windows Azure Service Management API Tool, which are both handy for remotely accessing the Azure portal via the Service Management API. At this stage I had no idea which one I would be using. I tried them both as part of a build and found that I preferred using the Service Management API tool, csmanage (despite being a big fan of PowerShell).
The blog post referred to above illustrates the use of the X.509 certificate, the API and PowerShell to deploy to the Azure staging environment.

PACKAGING FOR DEPLOYMENT

Next I looked at packaging the application ready for deployment. There are two key things when packaging the application from the command line:
1. Obtain the role types and names, as these will be needed to construct the package
2. Make sure the location of the service definition file is known

The ServiceDefinition.csdef file contains the role types and names, which are needed to construct the package using the Windows Azure command line tool cspack. Below is a snippet from a ServiceDefinition.csdef file illustrating a simple example with one web role. The number of instances does not matter to cspack:
  • 33.
    <ServiceDefinition name="SeeTheDifference.Cloud" xmlns="">
      <WebRole name="SeeTheDifference.Web" enableNativeCodeExecution="true">
        <InputEndpoints>

If cspack is not run from the correct place the package will not be constructed correctly, which is why the location of the ServiceDefinition.csdef file is so important.

DEPLOYING

At this stage I was able to package the application and deploy to the Azure portal via MSBuild. We were concerned that, with this approach, problems with the actual package could affect the deployment. In particular we were concerned about what to do after handover to the client, when a little more caution would be called for. A change of plan was decided upon. The new plan was to push the package to blob storage, so that the client would be able to carry out the deployment at their convenience.

To push the package to blob storage, a C# console application I called LoadBlob was written that could be called from the MSBuild script. This application pushed the package to a pre-determined container. It was decided that storing the configuration (.cscfg) file in blob storage was also a good idea, as it would reduce the risk of non-production configuration settings being used. During testing I was unable to get the Service Management API to use the stored configuration file. It was only able to use one stored on the local system, but as the end to end deployment process we were implementing actually required a pause for breath before the push to Azure Staging or Production, this issue did not affect the implementation of the CI build process.

Finally, after testing all the constituent parts, they were incorporated into the CI build. Below is a snippet from a TFSbuild.proj file where I overrode the target AfterDropBuild.
The AfterDropBuild target is called after dropping the built binaries, and I used it to insert some commands that allow the build to use cspack (equivalent to zipping the DLLs and configuration files) to create the cloud service package, which is then pushed up to blob storage ready for deploying to Staging or Production.

    <PropertyGroup>
      <PathToAzureTools>c:\Program Files\Windows Azure SDK\v1.0\bin\cspack.exe</PathToAzureTools>
      <cPkgCmd>"$(PathToAzureTools)" SeeTheDifference.Cloud.csx\ServiceDefinition.csdef /role:SeeTheDifference.Web;seeTheDifference.Cloud.csx\roles\SeeTheDifference.Web\approot;SeeTheDifference.Web.dll</cPkgCmd>
      <LoadblobPath>c:\TOOLS\AzureDeployment</LoadblobPath>
      <LoadBlobCmd>$(LoadblobPath)\Loadblob.exe </LoadBlobCmd>
    </PropertyGroup>
  • 34.
    <Target Name="AfterDropBuild" DependsOnTargets="DeriveDropLocationUri" Condition=" '$(IsDesktopBuild)'!='true' ">
      <Message Text=" cspack creating a package for deployment"/>
      <Exec Command="$(cPkgCmd) /out:c:\Drops\SD_Deploy\$(BuildNumber).cspkg" WorkingDirectory="c:\Drops\test\$(BuildNumber)\Release\SeeTheDifference" />
      <!-- Load blob to Azure into deployment container set via config file settings
           target container will be cleared before uploading -->
      <Message Text =" Copying '$(BuildNumber)'.cspkg to deployment container in Azure " />
      <Exec Command ="$(LoadBlobCmd) -upload $(BuildNumber).cspkg" WorkingDirectory="c:\Drops\SD_Deploy" />
    </Target>

[Screenshots: the uploaded .cspkg in blob storage]

The deployment could then be completed by using a user friendly tool like Cerebrata Cloud Storage Studio.
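The packaging step described in this article needs the role names from the service definition file. As a sketch only, the Python below reads role names and types from a ServiceDefinition.csdef document and assembles a cspack argument list in the same `/role:name;directory;assembly` and `/out:` shape as the MSBuild properties above. The helper names, the assumed `roles\<name>\approot` layout, the output path and the sample XML (including its placeholder namespace) are assumptions for illustration, not part of the project's actual build.

```python
import xml.etree.ElementTree as ET

# Sample input modelled on the snippet in this article; the namespace is a placeholder.
SAMPLE_CSDEF = """\
<ServiceDefinition name="SeeTheDifference.Cloud" xmlns="urn:example:placeholder">
  <WebRole name="SeeTheDifference.Web" enableNativeCodeExecution="true">
    <InputEndpoints />
  </WebRole>
</ServiceDefinition>
"""


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def read_roles(csdef_text):
    """Return (role_type, role_name) pairs found directly under the root."""
    root = ET.fromstring(csdef_text)
    return [(local_name(child.tag), child.get("name"))
            for child in root
            if local_name(child.tag) in ("WebRole", "WorkerRole")]


def cspack_args(csdef_path, roles, csx_dir, out_path):
    """Build a cspack argument list; paths follow an assumed csx layout."""
    args = ["cspack.exe", csdef_path]
    for _role_type, name in roles:
        approot = f"{csx_dir}\\roles\\{name}\\approot"
        args.append(f"/role:{name};{approot};{name}.dll")
    args.append(f"/out:{out_path}")
    return args


roles = read_roles(SAMPLE_CSDEF)
command = cspack_args(
    "SeeTheDifference.Cloud.csx\\ServiceDefinition.csdef",
    roles,
    "SeeTheDifference.Cloud.csx",
    "c:\\Drops\\SD_Deploy\\example.cspkg",
)
```

Deriving the `/role:` arguments from the file rather than hard-coding them in the build script means a newly added role is picked up without editing the script, at the cost of one more moving part in the build.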
  • 35. The Windows Azure Platform: Articles from the Trenches

USING JAVA WITH THE WINDOWS AZURE PLATFORM
By Rob Blackwell

With a name like Windows Azure, you could be forgiven for thinking that Microsoft’s cloud computing offering is a Microsoft-only technology. In fact it has a lot to offer Java developers through its use of open standards and RESTful APIs.

ACCESSING WINDOWS AZURE STORAGE FROM JAVA

WindowsAzure4J is an open source library that can be used to access Windows Azure Storage from Java applications, running on Windows Azure or elsewhere. Download the JAR file from . You’ll also need to grab some other dependencies:

    commons-collections-3.2.1.jar
    commons-logging-1.1.1.jar
    dom4j-1.6.1.jar
    httpclient-4.0-beta2.jar
    httpcore-4.0.jar
    httpcore-nio-4.0.jar
    httpmime-4.0-beta2.jar
    jaxen-1.1.1.jar
    log4j-1.2.9.jar

To get started, you’ll need an account name and account key from the Windows Azure portal. Paste these into the sample code provided with WindowsAzure4J to use Blobs, Queues or Tables.

If you are an Eclipse user, you can also install the Windows Azure Tools for Eclipse.
  • 36. The Windows Azure Platform: Articles from the Trenches

The Windows Azure Storage Explorer running in Eclipse.

RUNNING JAVA CODE ON WINDOWS AZURE

If you want to host a Java application in Windows Azure, there are a number of considerations.

The first thing to note is that even if your Java application is a web application, you probably won’t want to use an Azure web role. The principal difference between web roles and worker roles is whether Internet Information Services (IIS) is included. Most Java developers will want to use a Java-specific web server or framework, so it’s usually best to go with a worker role and include your choice of web server within your deployment package. You’ll also need to bootstrap Java from a small .NET program that will essentially invoke the Java runtime through a Process.Start call.

Both web roles and worker roles are provisioned behind a load balancer, so either is suitable for hosting web applications. In a worker role you just have to do some additional plumbing to connect your web server to the appropriate load-balanced input endpoint. So, for example, the public-facing port 80 might get mapped to, say, port 5100 in your worker role. The following code allows you to determine this port at runtime:

    RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["Http"].IPEndpoint.Port
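As a rough illustration of the bootstrapping step, here is a minimal sketch of how a small .NET program might launch a batch file that in turn starts Java, passing the port as the first argument. The class name JavaBootstrapper and the batch file name run.bat are hypothetical, and the port is passed in as a plain parameter; in a real worker role it would come from the RoleEnvironment expression shown above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: not part of the Windows Azure SDK.
public static class JavaBootstrapper
{
    // Builds the start info for a batch file that expects the port as %1.
    public static ProcessStartInfo BuildStartInfo(string workingDirectory, string batchFile, int port)
    {
        return new ProcessStartInfo
        {
            FileName = "cmd.exe",
            // /c runs the command and then exits; the port becomes %1 in the batch file.
            Arguments = string.Format("/c \"{0}\" {1}", batchFile, port),
            WorkingDirectory = workingDirectory,
            UseShellExecute = false,          // required for output redirection
            RedirectStandardOutput = true,
            RedirectStandardError = true
        };
    }

    // Starts the process, echoes its output, and blocks until it exits.
    public static int Run(ProcessStartInfo startInfo)
    {
        using (var process = new Process { StartInfo = startInfo })
        {
            process.OutputDataReceived += (s, e) => { if (e.Data != null) Console.WriteLine(e.Data); };
            process.ErrorDataReceived += (s, e) => { if (e.Data != null) Console.Error.WriteLine(e.Data); };
            process.Start();
            process.BeginOutputReadLine();
            process.BeginErrorReadLine();
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A worker role's Run method could call JavaBootstrapper.Run(JavaBootstrapper.BuildStartInfo(appRoot, "run.bat", port)) and treat a non-zero exit code as a failure; how the Java files reach appRoot is a separate concern.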
  • 37. The Windows Azure Platform: Articles from the Trenches

Fortunately, both the Tomcat Solution Accelerator and AzureRunMe handle all of these technicalities for you.

The Tomcat Solution Accelerator is a good choice if you have a traditional Java-based web application. It supports Java Servlet and JavaServer Pages applications, possibly packaged as a WAR file. It can be downloaded from . The accelerator walks you through the process of creating an Azure cloud services package file that contains your application as well as the Tomcat server and Java runtime. It automatically handles the necessary configuration. Just upload the resulting cspkg file to Windows Azure, wait for it to deploy, then bring up your web browser and browse to

AZURERUNME

AzureRunMe doesn’t assume any particular web server or framework. In fact, you could just run a straightforward command line application with no visible user interface. That said, I’ve used it successfully with both Restlet and Jetty.

Imagine that you were going to run your application from a USB drive and that you weren’t allowed to install any software onto the machine. You’d have to include the Java Runtime Environment (JRE), all the library JAR files and any data, all in subdirectories of the USB stick. You’d probably create a .BAT file at the top level to run everything. Like this:

    cd MyApp
    ..\jre\bin\java -cp MyApp.jar;lib\* Start %1

AzureRunMe takes a similar approach: put all these files together in a single ZIP file and upload it to Blob storage. Then download the AzureRunMe cspkg file and use it to bootstrap your Java code.

Notice that the batch file takes a parameter, %1. This is the port that you should use if you want to bring up a web server; the load balancer will direct all HTTP traffic to your application on this port.

AzureRunMe comes with a trace listener that uses the Service Bus to relay standard output and any log4j messages back to a command window on your desktop machine. It makes it easy to see trace messages, watch your application’s progress and see any exception messages.
  • 38. The Windows Azure Platform: Articles from the Trenches AzureRunMe Trace Listener showing log messages relayed via the AppFabric Service Bus. For more information about Interoperability on the Microsoft platform see 38
  • 39. The Windows Azure Platform: Articles from the Trenches

CHAPTER 3: WINDOWS AZURE

AUTO-SCALING WINDOWS AZURE COMPUTE INSTANCES
By Steven Nagy

INTRODUCTION

There are many reasons applications need to scale. Some applications have on/off periods of batch processing (for example, overnight render farms), some have predictable peak loads (for example, share market applications peak during the open and close of the market) and some might have unpredictable peak periods (for example, your website gets linked by Slashdot).

In the case of predictable peak loads we can easily log in to the Windows Azure portal and adjust our configuration file to increase the number of instances of our web and worker roles. However, when application load peaks unexpectedly, we want our applications to respond immediately. For applications with global reach, this might be when we least expect it. Without appropriate monitoring techniques we may not even know the extent to which we are failing to serve requests.

On the flip side, we are paying for every CPU core hour we use. Thus we want to be able to scale down instances that are underutilised. We need to know how to auto-scale; our applications need to become smart.

A BASIC APPROACH

There are a number of jigsaw pieces that need to fit together to build the auto-scaling picture. The first piece is monitoring, which lets us pull information from the roles that need to auto-scale. The next piece is about establishing rules and measuring against thresholds to determine when to scale up and scale down. The third piece establishes trust between the service that is doing the monitoring (referred to from here on as the ‘Scale Agent’) and the roles that are being monitored. Finally, the Scale Agent needs to instruct the Windows Azure Portal to add or remove instances of those roles as it deems necessary.

[Figure: the auto-scaling pieces: Monitoring, Rules, Trust and Instruct, with the Scale Agent]

THE SCALE AGENT

The Scale Agent is responsible for monitoring your application, applying rules and instructing the API to scale your roles, and can be hosted in different ways. One option is hosting the agent as another process on your existing Azure roles, but a role can have many identical instances, so which instance would it run on? And the agent will take some CPU resources; could that impact on its ability to
  • 40. The Windows Azure Platform: Articles from the Trenches

assess the other work running on the same role? It makes more sense to move the Scale Agent to a separate location that doesn’t interfere with the standard workload, where its own workload won’t pollute the statistics.

The agent can be hosted as another worker role, separate from the main work being done by the application. This worker role would never need to scale, and could be geo-located and co-located near the compute instances that it needs to monitor. This removes external bandwidth costs and allows for faster processing and assessment.

You could also host the agent off site completely, perhaps in your own data centre, as a Windows service. This means you have more control over the agent, but the agent will be slightly slower communicating with the instances, getting performance counter logs, and issuing scale commands.

A dedicated worker role is usually the best option, but also the hardest to configure for trust, as we’ll see further on.

MONITORING: RETRIEVING DIAGNOSTIC INFORMATION

Before we can make decisions about scaling, we need to know some simple statistics about the services we want to scale. These statistics in turn let us make informed decisions. Diagnostic helper processes will put performance counter information into table and blob storage, so this will require an Azure Storage project. There are lots of counters to choose from, but we usually want to monitor memory usage, CPU usage, and number of requests per second, and if any one of those exceeds an upper threshold then we want to scale up.

The role that needs to auto-scale will be responsible for gathering its own performance information and dumping it into a storage table. This is done via configuration classes, available in the Microsoft.WindowsAzure.Diagnostics namespace:

    var perfConfig = new PerformanceCounterConfiguration();
    perfConfig.CounterSpecifier = @"\Processor(0)\% Processor Time";
    perfConfig.SampleRate = TimeSpan.FromSeconds(5);

We create a configuration item for a performance counter we want to track; in this example we want information about CPU utilisation. The counter will be sampled at 5 second intervals.

    var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
    config.PerformanceCounters.DataSources.Add(perfConfig);
    config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
    DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
  • 41. The Windows Azure Platform: Articles from the Trenches

We then add the performance counter to the list of items we want the DiagnosticMonitor to track for us. The DiagnosticMonitor runs in a separate process on the virtual machine instance, so it won’t interfere with our normal application code. Every minute, new performance counter information will be written back to the storage account specified in the ‘DiagnosticsConnectionString’ setting, into a table called ‘WADPerformanceCountersTable’.

We can verify that the counter information made it into the table using third-party tools. You can see that the table has an entity with a property called ‘CounterValue’, which contains our CPU utilisation.

I won’t go into the code required to view an entity in table storage; this is very well documented already [11]. Your Scale Agent will retrieve these values by polling the table occasionally and keeping track of the utilisation, scaling when needed.

RULES: ESTABLISHING WHEN TO SCALE
  • 42. The Windows Azure Platform: Articles from the Trenches

The Scale Agent now knows what levels your various role instances are at, based on the performance counter information. However, deciding when to scale up or down is difficult and can easily become an exercise in advanced mathematics. Although the rules are different for every application, here are some common issues to consider:

- You usually need a certain amount of head room, in case you get a sudden spike in load before your Scale Agent can spin up more instances.
- Immediately after scaling up, your original instances might still be over the threshold. Prevent your agent from scaling up again until enough time has passed that you can be positive that more scale is needed.
- Aggregate your usage from all instances. If a single instance is spiking but the rest are under normal load, you don’t really need to scale.
- If you do need more instances, scale up based on how many instances you currently have. For example, if you only have 5 instances, you might want to add 2 more (40% increase) before checking again. If you have 50 you may only want to add 10 (20% increase).
- Try to predict load based on patterns of behaviour. For instance, if over the last 15 minutes you’ve been steadily climbing by 5% utilisation per minute, you can predict that you will probably go over your threshold in X number of minutes. Why wait until you are overloaded and losing connections before scaling? Analysing these kinds of patterns can let you scale up “just in time”.
- Predictive patterns can get very complicated. If at 4pm every day you seem to have additional load, prepare in advance for scale rather than waiting for auto-scale to kick in.
- Keep in mind that long running requests can provide false positives. If all web threads are used for an instance but all those threads are held up in IO requests, you will still have low CPU utilisation, so consider a range of performance counters specific to your type of application and architecture.
- Hard limits: if your average is 3 instances, would you want your application to be allowed to auto-scale up to 500 instances? That’s probably not a credit card bill you want to receive, so consider imposing some hard limits on scale, or provide some reasonable alerting (SMS, email, etc.) so that if your app DOES scale to 500, you can find out immediately and hop online to see why.

TRUST: AUTHORISING FOR SCALE

There is a rich management API that can be used to control your Windows Azure projects. However, in order to issue commands there needs to be trust between the Scale Agent and the API of the account hosting the roles; this trust is established via X509 certificates. Generating certificates is also well documented. Once created, we need to provide our certificate in 3 places:

- The Windows Azure account, for the Service Management API to check requests against
- The virtual machine issuing commands, in our case where the Scale Agent is hosted
- The service configuration and definition for our Scale Agent project
  • 43. The Windows Azure Platform: Articles from the Trenches

In the Windows Azure portal for the account you wish to manage, there is an ‘Account’ tab where you can upload DER encoded certificates with a .CER extension:

You must also upload the certificate in the Personal Information Exchange format, with a .PFX extension and the matching password, to your service project so that the certificate becomes available to any virtual machine instance provisioned from that entire project. This can be found under the Certificates section of your service deployment:

Click on ‘Manage’ and upload the .PFX version of your certificate. It is important to note that this is not installing the certificate to the role instances under this service. Instead, it is making the certificate available to any role that requests it. To make that request we have to complete the third step and tell our Scale Agent role that it will require that certificate.
  • 44. The Windows Azure Platform: Articles from the Trenches

While it is possible to enter the required XML manually, it is much easier to use the property pages instead. For the role that needs the certificate (i.e. your Scale Agent role), find it in your Cloud Service project, right click and select Properties. In the property pages, find the Certificates tab on the left. Select ‘Add Certificate’ from the top and enter the details.

The important part here is finding your certificate under the right Store Location and Name. This screen presumes the certificate is installed locally, as it uses local machine stores to search for it. If you don’t have it installed locally, you can just paste in the thumbprint manually.

That wraps up all 3 parts of the certificate process. When your role is deployed to Windows Azure, it will ask for the certificate with that thumbprint to be installed into the virtual machine.

SCALING – THE SERVICE MANAGEMENT API

We know we need to scale and we have established trust, so all we need to do is issue the command: scale! All API calls are RESTful, but there is no API that exists solely for scaling up and down. Instead, this is done through the service configuration file, which is maintained separately from the service deployment. You can at any time go and change the configuration for your deployment through the portal, and the API is just an extension of this functionality.

The steps required are:

1. Request the configuration file for a service deployment
2. Find the XML element for the instance count on the role you are scaling
3. Make the change
4. Post the configuration file back to the service API

If you don’t want to manually manipulate the REST API yourself, Microsoft has posted code samples to assist you, including samples on scale [12] and the Service Management API [13].
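Steps 2 and 3 above are plain XML manipulation, so they can be sketched without touching the REST API at all. The following is a minimal sketch, assuming the service configuration has already been retrieved as a string (step 1) and will be posted back afterwards (step 4); both REST calls and the certificate handling are deliberately left out. The class ScaleConfigEditor and its method names are hypothetical, and the min/max clamp is just one way of applying the hard limits discussed under the rules section.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: edits the instance count inside a service configuration document.
public static class ScaleConfigEditor
{
    // XML namespace used by Windows Azure service configuration (.cscfg) files.
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration";

    // Step 2: locate the <Instances> element for the named role.
    private static XElement FindInstances(XDocument doc, string roleName)
    {
        XElement role = doc.Root.Elements(Ns + "Role")
            .FirstOrDefault(r => (string)r.Attribute("name") == roleName);
        if (role == null)
            throw new ArgumentException("Role not found: " + roleName);

        XElement instances = role.Element(Ns + "Instances");
        if (instances == null)
            throw new InvalidOperationException("Role has no Instances element: " + roleName);
        return instances;
    }

    public static int GetInstanceCount(string configXml, string roleName)
    {
        return (int)FindInstances(XDocument.Parse(configXml), roleName).Attribute("count");
    }

    // Step 3: change the count, clamped to hard limits so a runaway rule cannot over-scale.
    public static string SetInstanceCount(string configXml, string roleName,
                                          int requested, int min, int max)
    {
        int clamped = Math.Max(min, Math.Min(max, requested));
        XDocument doc = XDocument.Parse(configXml);
        FindInstances(doc, roleName).SetAttributeValue("count", clamped);
        return doc.ToString();
    }
}
```

For example, calling SetInstanceCount(config, "WebRole1", 500, 1, 10) on a configuration that currently has two instances yields a document asking for 10, not 500. Whether a clamp, an alert, or both is appropriate depends on the application.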
  • 45. The Windows Azure Platform: Articles from the Trenches

SUMMARY

This short article provides you with the theory to scale your applications reactively. The same technique can also be used proactively: scheduled scale up and scale down can be automated in the same way, instead of waiting to react to load.

While this article has presented just one way of scaling automatically, there are other derivatives and approaches you could follow. For example, the Scale Agent could pull diagnostic information from the roles via the Diagnostic Manager classes, rather than the roles pushing that information. The open source framework Lokad.Cloud [14] takes another approach by allowing roles to auto-scale themselves.

Find the approach that’s right for you and capitalise on economies of scale today!
  • 46. The Windows Azure Platform: Articles from the Trenches

BUILDING A CONTENT-BASED ROUTER SERVICE ON WINDOWS AZURE
By Josh Tucholski

Some applications, depending on their nature, require priority processing based on request content. It is typical in these scenarios to develop an application layer to route requests from the client to a specific business component for further processing. Implementing this in Windows Azure is not straightforward due to its built-in load balancer. The Windows Azure load balancer only exposes a single external endpoint that clients interact with; therefore it is necessary to know the unique IP address of the instance that will be performing the work. IP addresses are discoverable via the Windows Azure API when endpoints are marked as internal (configured through the web role’s properties). While this tutorial may seem more of an exercise on WCF than on Windows Azure, it is important to understand how to perform inter-role communication without the use of queues.

In order to filter requests by content, an internal LoadBalancer class is created. This class ensures requests are routed to live endpoints and not dead nodes. The LoadBalancer will need to account for endpoint failure and guarantee graceful recovery by refreshing its routing table and passing requests to other nodes capable of processing. Following is the class definition for the LoadBalancer to detect endpoints and recover from unexpected failures that occur.

    public class LoadBalancer
    {
        // CurrentRouters (declaration not shown) is the LoadBalancer’s own record of known endpoints.

        public LoadBalancer()
        {
            if (IsRoutingTableOutOfDate())
            {
                RefreshRoutingTable();
            }
        }

        private bool IsRoutingTableOutOfDate()
        {
            //Retrieve all of the instances of the Worker Role
            var roleInstances = RoleEnvironment.Roles["WorkerName"].Instances;

            //Check current amount of instances and confirm sync with the LoadBalancer’s record
            if (roleInstances.Count() != CurrentRouters.Count())
            {
                return true;

  • 47. The Windows Azure Platform: Articles from the Trenches

            }

            foreach (RoleInstance roleInstance in roleInstances)
            {
                var endpoint = roleInstance.InstanceEndpoints["WorkerEndpoint"];
                var ipAddress = endpoint.IPEndpoint;
                if (!IsEndpointRegistered(ipAddress))
                {
                    return true;
                }
            }
            return false;
        }

        private void RefreshRoutingTable()
        {
            var currentInstances = RoleEnvironment.Roles["WorkerName"].Instances;
            RemoveStaleEndpoints(currentInstances);
            AddMissingEndpoints(currentInstances);
        }

        private void AddMissingEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
        {
            foreach (var instance in currentInstances)
            {
                if (!IsEndpointRegistered(instance.InstanceEndpoints["WorkerEndpoint"].IPEndpoint))
                {
                    //add to the collection of endpoints the LoadBalancer is aware of
                }
            }
        }

        private void RemoveStaleEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
        {
            //reverse-loop so we can remove from the collection as we iterate
            for (int index = CurrentRouters.Count() - 1; index >= 0; index--)
            {
                bool found = false;
                foreach (var instance in currentInstances)
                {
                    //determine if IP address already exists; if so, set found to true
                }
                if (!found)
                {
                    //remove from collection of endpoints LoadBalancer is aware of
                }
            }
        }

        // The endpoint type is System.Net.IPEndPoint.
        private bool IsEndpointRegistered(IPEndPoint ipEndpoint)
        {
            foreach (var routerEndpoint in CurrentRouters)
            {
                if (routerEndpoint.IpAddress == ipEndpoint.ToString())
                {
                    return true;
                }
            }
            return false;
        }

        public string GetWorkerIPAddressForContent(string contentId)
        {
            //Custom logic to determine an IP Address from one of the CurrentRouters
            //that the load balancer is aware of
            throw new NotImplementedException();
        }
  • 48. The Windows Azure Platform: Articles from the Trenches

    }

The LoadBalancer is capable of auto-detecting endpoints; the remaining work for the router service is WCF. A router, by definition, must be capable of accepting and forwarding any inbound request. The IRouterServiceContract will accept all requests with the base-level Message class and handle and reply to all actions. Its interface is as follows:

    [ServiceContract(Namespace = "", Name = "RouterServiceContract")]
    public partial interface IRouterServiceContract
    {
        [OperationContract(Action = "*", ReplyAction = "*")]
        Message ProcessMessage(Message requestMessage);
    }

The implementation of the IRouterServiceContract will use the MessageBuffer class to create a copy of the request message for further inspection (e.g. who the sender is, or whether there is a priority associated with it). GetWorkerIPAddressForContent on the LoadBalancer is invoked and a target endpoint is requested. Once the router has an endpoint, a ChannelFactory is initialized to create a connection to the endpoint and the generic ProcessMessage method is invoked. Ultimately the endpoint that the router forwards requests to will have a detailed service contract capable of completing the message processing.

    public partial class RouterService : IRouterServiceContract
    {
        private readonly LoadBalancer loadBalancer;

        public RouterService()
        {
            loadBalancer = new LoadBalancer();
        }

        public Message ProcessMessage(Message requestMessage)
        {
            //Create a MessageBuffer to attain a copy of the request message for inspection
            MessageBuffer buffer = requestMessage.CreateBufferedCopy(int.MaxValue);
            Message requestMessageCopy = buffer.CreateMessage();

            string ipAddress = loadBalancer.GetWorkerIPAddressForContent("content");
            string serviceAddress = String.Format("http://{0}/Endpoint.svc/EndpointBasic", ipAddress);

            using (var factory = new ChannelFactory<IRouterServiceContract>(new BasicHttpBinding("binding")))
            {
                IRouterServiceContract proxy = factory.CreateChannel(new EndpointAddress(serviceAddress));
                using (proxy as IDisposable)
                {
                    return proxy.ProcessMessage(requestMessageCopy);
                }
            }
        }
    }

Detecting and ensuring that the endpoints are active is half the battle. The other half is determining what partitioning scheme works effectively when filtering requests to the correct endpoint. You may decide to implement some way of consistently ensuring a client’s requests are processed by the same back-end component, or to route based on message priority. The approach outlined above also attempts to accommodate disaster-related scenarios, so that an uninterrupted experience can be provided to the client. If one of the back-end components happens to shut down due to a hardware failure, the load balancer implementation will ensure that there is another endpoint available for processing.
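One simple partitioning scheme, offered here only as a sketch and not as the scheme used by the article’s author, is to hash the content identifier and take the result modulo the number of known workers, so that the same identifier always lands on the same worker while the worker list is unchanged. The class ContentPartitioner and its methods are hypothetical; the hash is a plain FNV-1a implementation, chosen because it gives the same value on every machine and in every process.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: maps a content id to one of the known worker addresses.
public static class ContentPartitioner
{
    // 32-bit FNV-1a hash over the UTF-8 bytes of the value.
    public static uint StableHash(string value)
    {
        unchecked
        {
            uint hash = 2166136261;           // FNV offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(value))
            {
                hash ^= b;
                hash *= 16777619;             // FNV prime; overflow wraps by design
            }
            return hash;
        }
    }

    // Picks a worker for the content id; the same id maps to the same index
    // for as long as the list of workers stays the same.
    public static string PickWorker(IList<string> workerAddresses, string contentId)
    {
        if (workerAddresses == null || workerAddresses.Count == 0)
            throw new InvalidOperationException("No worker endpoints are registered.");

        int index = (int)(StableHash(contentId) % (uint)workerAddresses.Count);
        return workerAddresses[index];
    }
}
```

A method like GetWorkerIPAddressForContent could delegate to PickWorker with the addresses held in its routing table. Note that with a plain modulo, adding or removing a worker remaps many identifiers; if that matters, a consistent hashing scheme would be worth considering instead.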
  • 49. The Windows Azure Platform: Articles from the Trenches

BING MAPS TILE SERVERS USING AZURE BLOB STORAGE
By Steve Towler

Back in early 2009, I was assigned to a project where I was required to build an informational mapping solution for a customer’s website. This mapping solution served custom tiles of the UK which were specially commissioned for the project. Although the map only covered the UK and we had restricted the zoom levels to between 6 and 11, each set of tiles (and there were twelve sets) had around 4500 tiles and averaged 80 megabytes in size.

Less than 1 gigabyte of tiles may seem like a trivial figure in terms of the vast amounts of storage we have at our disposal nowadays. But what if things had been different? What if the customer wanted to cover Europe, or even more zoom levels? What would be the bandwidth implications and the potential costs associated with huge demand for the map?

With Windows Azure now “live”, had the same project landed on my desk today I would be looking to serve the map tiles differently, as Blob storage is ideally suited to such a task. Storage is highly scalable and cheap, and its RESTful interface makes requesting the tiles clean and simple.

Setting up a Bing Maps tile server using Windows Azure Blob storage is surprisingly easy, and you can have your own tile server up and running in a few small steps.

First things first, you need to crunch your tiles. This is the process whereby you take your custom map images and cut them up into tiles, ready to be used within your mapping application. There are plenty of tutorials on how to do this out on the web, and Microsoft MapCruncher is a preferred tool for carrying this task out.

Now that you have your “crunched tiles” and you have saved them off to a directory on your local machine, the next step is to get your tiles up into the cloud. For ease I am going to use CloudXplorer, one of the many Windows Azure storage management tools available on the web.

Using CloudXplorer, create a public container in blob storage called tiles. Now copy all of your “crunched tiles” from your local machine up to your newly created container in blob storage.
  • 50. The Windows Azure Platform: Articles from the Trenches

Once complete, your tiles should be publicly available using a URL like: (or if you are using local development storage)

You will now be able to consume the tiles from your tile server using Bing Maps. MSDN includes a piece of code (select the JScript tab) which shows you how to add your own custom tile layer to a Bing map. You can tweak that code to suit your own requirements, but the important thing to remember is to change the VETileSourceSpecification path to point to your new tile server:

    var tileSourceSpec = new VETileSourceSpecification("lidar", "");

The project I mentioned at the very beginning of this article was a success, and a happy customer is actively informing their potential customers as to their presence in the UK. Had Windows Azure been out of CTP, how differently would the project have turned out? The software consuming the tiles would have been the same, but the infrastructure serving the tiles would most certainly have been in the cloud.
  • 51. The Windows Azure Platform: Articles from the Trenches

AZURE DRIVE
By Neil Mackenzie

Azure Drive is a feature of Windows Azure providing access to data contained in an NTFS-formatted virtual hard disk (VHD) persisted as a page blob in Azure Storage. A single Azure instance can mount a page blob for read/write access as an Azure Drive. However, multiple Azure instances can mount a snapshot of a page blob for read-only access as an Azure Drive. The Azure Storage blob lease facility is used to prevent more than one instance at a time mounting the page blob as an Azure Drive. It is not possible to mount an Azure Drive in an application not resident in the Azure cloud or development fabric.

An appropriately created and formatted VHD can be uploaded into a page blob, from where it can be mounted as an Azure Drive by an instance of an Azure service. Similarly, the page blob can be downloaded and attached as a VHD on a local system.

The Azure SDK provides three classes in the Microsoft.WindowsAzure.StorageClient namespace to support Azure Drives:

- CloudDrive
- CloudDriveException
- CloudStorageAccountCloudDriveExtensions

CloudDrive is a small class providing the core Azure Drive functionality. CloudDriveException allows Azure Drive errors to be caught. CloudStorageAccountCloudDriveExtensions, similar to the CloudStorageAccountStorageClientExtensions class, provides an extension method to CloudStorageAccount allowing a CloudDrive object to be constructed.

GUEST OS

Azure Drive requires that the osVersion attribute in the Service Configuration file be set to WA-GUEST-OS-1.1_201001-01 or a later version. For example:

    <ServiceConfiguration serviceName="CloudDriveExample" xmlns="" osVersion="WA-GUEST-OS-1.1_201001-01">

VHD

The VHD for an Azure Drive must be a fixed hard disk image formatted as a single NTFS volume. It must be between 16MB and 1TB in size.

A VHD is a single file comprising a data portion followed by a 512 byte footer. For example, a nominally 16MB VHD occupies 0x1000200 bytes, comprising 0x1000000 bytes of data and 0x200 footer bytes. When uploading a VHD it is important to remember to upload the footer. Furthermore, since pages of a page blob are initialized to 0, it is not necessary to upload pages in which all the bytes are 0. This could save a significant amount of time when uploading a large VHD.

The Disk Management component of the Windows Server Manager can be used to create and format a VHD.
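The size arithmetic and the skip-empty-pages idea can be sketched in a few lines. This is only an illustration: the class VhdUploadHelper and its members are hypothetical, and the actual upload calls are omitted. It relies on two facts from the text above, the 512 byte footer and zero-initialized pages, plus the 512 byte page size of page blobs.

```csharp
using System;

// Hypothetical helper for planning a VHD upload to a page blob.
public static class VhdUploadHelper
{
    public const int FooterSizeInBytes = 512;   // VHD footer
    public const int PageSizeInBytes = 512;     // page blob page size

    // Total page blob size for a fixed VHD: the data portion plus the footer.
    public static long GetPageBlobSize(int nominalSizeInMegabytes)
    {
        return (long)nominalSizeInMegabytes * 1024 * 1024 + FooterSizeInBytes;
    }

    // True when every byte in the range is zero, in which case the range
    // does not need to be uploaded because blob pages start out as zeros.
    public static bool IsAllZero(byte[] buffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
        {
            if (buffer[i] != 0)
            {
                return false;
            }
        }
        return true;
    }
}
```

An uploader could read the VHD in chunks that are a multiple of PageSizeInBytes, skip any chunk for which IsAllZero returns true, and always write the final 512 bytes containing the footer.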
  • 52. The Windows Azure Platform: Articles from the Trenches

CLOUDDRIVE

A CloudDrive object can be created using either a constructor or the CreateCloudDrive extension method to CloudStorageAccount. For example, the following creates a CloudDrive object for the VHD contained in the page blob resource identified by the URI in cloudDriveUri:

    CloudStorageAccount cloudStorageAccount =
        CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
    CloudDrive cloudDrive = cloudStorageAccount.CreateCloudDrive(cloudDriveUri.AbsoluteUri);

Note that this creates an in-memory representation of the Azure Drive which still needs to be mounted before it can be used.

Create() physically creates a VHD of the specified size and stores it as a page blob. Note that Microsoft charges only for initialized pages of a page blob, so there should only be a minimal charge for an empty VHD page blob even when the VHD is nominally of a large size. The Delete() method can be used to delete the VHD page blob from Azure Storage. Snapshot() makes a snapshot of the page blob containing the VHD, while CopyTo() makes a physical copy of it at the specified URL.

A VHD page blob must be mounted on an Azure instance to make its contents accessible. A VHD page blob can be mounted on only one instance at a time. However, a VHD snapshot can be mounted as a read-only drive on an unlimited number of instances simultaneously. A snapshot therefore provides a convenient way to share large amounts of information among several instances. For example, one instance could have write access to a VHD page blob while other instances have read-only access to snapshots of it, including snapshots made periodically to ensure the instances have up-to-date data.

Before a VHD page blob can be mounted, it is necessary to allocate some read cache space in the local storage of the instance. This is required even if caching is not going to be used. InitializeCache() must be invoked to initialize the cache with a specific size and location. The following shows the Azure Drive cache being initialized to the maximum size of the local storage named CloudDrives:

    public static void InitializeCache()
    {
        LocalResource localCache = RoleEnvironment.GetLocalResource("CloudDrives");
        Char[] backSlash = { '\\' };
        String localCachePath = localCache.RootPath.TrimEnd(backSlash);
        CloudDrive.InitializeCache(localCachePath, localCache.MaximumSizeInMegabytes);
    }

The tweak in which trailing backslashes are removed from the path to the cache is a workaround for a bug in the Storage Client library.

An instance mounts a writeable Azure Drive by invoking Mount() on a VHD page blob. The Azure Storage Service uses the page blob leasing functionality to guarantee exclusive access to the VHD page blob. An instance mounts a read-only Azure Drive by invoking Mount() on a VHD snapshot. Since it is read-only, multiple instances can mount the VHD snapshot simultaneously. An instance
  • 53. The Windows Azure Platform: Articles from the Trenches invokes the Unmount() method to release the Azure Drive and, for VHD page blobs, allow other instances to mount the blob for write access. The cacheSize parameter to Mount() specifies how much of the cache is dedicated to this Azure Drive. The cacheSize should be set to 0 if caching is not desired for the drive. Different Azure Drives mounted on the same instance can specify different cache sizes and care must be taken that the total cache size allocated for the drives does not exceed the amount of cache available in local storage. The options parameter takes an DriveMountOptions flag enumeration that can be used to force the mounting of a drive – for example, when an instance has crashed while holding the lease to a VHD page blob – or to fix the file system. Mount() returns the drive letter, or LocalPath, to the Azure Drive - for example, "d:" - which can be used to access any path on the drive. The following example shows an Azure Drive being mounted from a VHD page blob specified by cloudDriveUri, before being used and then unmounted: public void WriteToDrive( Uri cloudDriveUri ) { CloudStorageAccount cloudStorageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString"); CloudDrive cloudDrive = cloudStorageAccount.CreateCloudDrive(cloudDriveUri.AbsoluteUri); String driveLetter = cloudDrive.Mount(CacheSizeInMegabytes, DriveMountOptions.None); String path = String.Format("{0}Pippo.txt", driveLetter); FileStream fileStream = new FileStream(path, FileMode.OpenOrCreate); StreamWriter streamWriter = new StreamWriter(fileStream); streamWriter.Write("that you have but slumbered here"); streamWriter.Close(); cloudDrive.Unmount(); } GetMountedDrives() provides access to a list of drive letters for all Azure Drives mounted in the instance. The DriveInfo class can be used to retrieve information about a mounted Azure Drive. 
The entry point to this information is the static method DriveInfo.GetDrives() which returns an array of DriveInfo objects representing all mounted drives on the instance.

DEVELOPMENT ENVIRONMENT

The Development Environment simulates Azure Drives in a manner that differs from their implementation in the cloud. Furthermore, the Development Storage simulation is unaware of the Azure Drive simulation with the consequence that the standard blob manipulation methods in the Storage Client API do not work with the VHD page blobs and VHD snapshots used by the Azure Drive simulation. Instead, the blob management methods in the CloudDrive class must be used.
The Azure Drive simulation does not mount Azure Drives from VHD page blobs but through the use of subst against a folder (e.g. drivename) in a subfolder (e.g. drivecontainername) of a well-known directory:

    %LOCALAPPDATA%\dftmp\wadd\devstoreaccount1\

The full path to the folder that is the subst representation of the Azure Drive is:

    %LOCALAPPDATA%\dftmp\wadd\devstoreaccount1\drivecontainername\drivename

Invoking CloudDrive.Create() or CloudDrive.Snapshot() causes a folder with the name of the VHD page blob or VHD snapshot to be created in this directory. CloudDrive.Delete() can be used to delete the VHD page blob or VHD snapshot. Note that, although visible in the Azure fabric, VHD page blobs and VHD snapshots do not appear in blob listings in Development Storage because they are not stored as blobs.

Consequently, a VHD file uploaded to Development Storage cannot be mounted as an Azure Drive. The workaround is to attach the VHD to an empty folder in the well-known directory. The Disk Management component of the Windows Server Manager is used to attach the VHD to an empty folder (e.g. drivename) in a subdirectory (e.g. drivecontainername) of the well-known directory so that the VHD can be mounted precisely as it would be in the cloud. The Azure Drive API can then be used to mount the Azure Drive as if it were backed by a VHD page blob named drivename, located in a container named drivecontainername. There is no need to invoke the Create() method. There is no entry in Development Storage for this blob.

Note that subst can be invoked in a command window to view the list of currently mounted Azure Drives.

Azure Drives are mounted and unmounted in the Development Environment just as they are in the cloud. However, DriveMountOptions.Force is not implemented in the Development Environment.
It is important to remember that Azure Drives are available only inside the Azure Fabric – cloud or development – and that they are not mountable in an ordinary Windows application.
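As a small illustration of the DriveInfo class mentioned earlier, the following sketch lists the drives visible to the current process using only the standard System.IO API. The class name DriveReport and the output format are hypothetical, and nothing here is specific to Azure Drives: inside a role instance a mounted Azure Drive would simply be expected to appear among the entries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DriveReport
{
    // Returns one descriptive line per drive that is ready for use.
    public static List<string> DescribeReadyDrives()
    {
        var lines = new List<string>();
        foreach (DriveInfo drive in DriveInfo.GetDrives())
        {
            // Size properties throw on drives that are not ready, so skip those.
            if (!drive.IsReady) continue;
            try
            {
                lines.Add(String.Format("{0} type={1} free={2}MB total={3}MB",
                    drive.Name, drive.DriveType,
                    drive.AvailableFreeSpace / (1024 * 1024),
                    drive.TotalSize / (1024 * 1024)));
            }
            catch (IOException) { /* drive became unavailable; ignore it */ }
            catch (UnauthorizedAccessException) { /* no permission; ignore it */ }
        }
        return lines;
    }
}
```

Calling DriveReport.DescribeReadyDrives() and writing each line to the role's log is one simple way to confirm which drive letter a mounted drive received.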
AZURE TABLE SERVICE AS A NOSQL DATABASE
By Mark Rendle

The Windows Azure SDK is one of the things which sets the Azure platform above other “cloud” platforms. The Table Service SDK, in particular, wraps the massively scalable storage service in an API which is instantly familiar to anyone who has used LINQ-to-SQL or the Entity Framework. CLR property names are used as column names, class names are (by default) used as table names.

But this simplicity enforces the concept of schema over a data store which is innately schema-less. When you create an Azure Table, you do not specify columns. The table itself is not structured in that way. The column names are part of the entities (rows) which are stored in the table, and they can be different for different entities within a single table. This fact opens up a world of interesting possibilities when it comes to planning and designing your persistence layer.

MASTER-DETAIL STRUCTURES

The Table Service does not support relational features, such as primary/foreign keys, joins in queries, or transactions to coordinate modifications across multiple tables. But Table Service entities with the same partition key are held together in the store, can be retrieved very quickly with a single query, and can be modified together inside Entity Group Transactions. Because rows can have different structures, you can actually store the data from two (or more) different types of object within the same partition key in the same table.

Let’s say, for example, you are storing Invoices, with an arbitrary number of line items for each invoice. Using the invoice number for the Partition Key, an empty string value for the Row Key of the Invoice entity, and then sequential numbering for the Row Keys of the LineItem entity, this “master-detail” data can be created in a single transaction, and retrieved incredibly quickly in a single query.
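The key layout described above can be sketched without touching the storage API at all. In the following minimal example the class name InvoiceKeys, the six-digit zero padding, and the sample invoice number are assumptions made for illustration; the padding is there because row keys are compared as strings, so unpadded numbers such as "10" would sort before "2".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvoiceKeys
{
    // Row key of the master (Invoice) entity: the empty string sorts before any other key.
    public const string InvoiceRowKey = "";

    // Row key of a detail (LineItem) entity; zero padding keeps string order equal to numeric order.
    public static string LineItemRowKey(int lineNumber)
    {
        return lineNumber.ToString("D6");
    }

    // Returns the (PartitionKey, RowKey) pairs for one invoice, ordered by RowKey.
    public static List<KeyValuePair<string, string>> KeysFor(string invoiceNumber, int lineItemCount)
    {
        var keys = new List<KeyValuePair<string, string>>();
        keys.Add(new KeyValuePair<string, string>(invoiceNumber, InvoiceRowKey));
        for (int i = 1; i <= lineItemCount; i++)
        {
            keys.Add(new KeyValuePair<string, string>(invoiceNumber, LineItemRowKey(i)));
        }
        // Ordinal ordering is used here as a stand-in for the store's string ordering.
        return keys.OrderBy(k => k.Value, StringComparer.Ordinal).ToList();
    }
}
```

With this layout the Invoice entity is always the first row of its partition, followed by its line items in numeric order, so a single partition query returns the master row and all of its details together.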
Figure 1: Invoice and Line Item entities stored within a single Azure Table

DYNAMIC SCHEMA

It’s very common these days for database applications to allow the end user to extend the out-of-the-box data model with their own fields. In a relational database, this is commonly achieved with a complicated system of metadata tables, and performance when querying against these custom fields is accordingly horrible.
In Azure, these fields can be added to each entity just by specifying the extra column names and values in the Insert operation. And then subsequently, querying against these columns can be done in exactly the same way as against the columns that were part of the original application.

Be aware, though, that this approach requires you to get down and dirty with the REST API, where you have complete control over column names at the per-entity level. Also be aware that there is a hard limit of 255 properties per entity, including the Partition Key, Row Key and Timestamp system columns.

COLUMN NAMES AS DATA

There are times when you want to store several thousand rows of related data; things like activity logs, or relationships between users in a social-networking database. Azure Tables can handle this volume of data very easily, but because a query operation can only return 1,000 rows per result set, retrieving them all could take several round-trips to the server, increasing the time of the operation and the cost of the transactions.

If the data can be stored as a single string or binary blob, though, you can group 250 “rows” together in a single entity, using the column name as a makeshift sub-key. This is possible because there is absolutely no limit to the number of different column names that can be used within a table.

The best way to achieve this is to use an empty Row Key to identify the “active” entity; that is, the one with spare columns to add data into. In addition, have an extra column, named “UniqueId”, with a timestamp or Guid value.
By running MERGE updates against this entity, you can add new “rows”; when the MERGE operation fails with a “too many values” error, you simply create a copy of that entity (Row Keys cannot be updated) with the UniqueId value as the Row Key, and reset the active entity to clear all the values and set a new UniqueId: this prevents two simultaneous operations from creating duplicate copies of the entity.

Figure 2: Using column names as data to reduce number of rows used for high-volume tables

TABLE NAMES AS DATA

Another thing which is not limited is the number of tables you can create within an account or project. And because you don’t have to specify the schema of each table, creating hundreds of them is not prohibitive.

One obvious use for this is where you need short-lived, high-volume tables, perhaps to contain analytics data which gets archived and cleared down after a couple of weeks (to cut down on storage costs). Running hundreds of DELETE operations against a single table comes with scalability issues and incurs a high transaction cost. But if you create multiple tables with the date as part of the name, clearing down a day’s data is just a matter of dropping the table; one quick operation.
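A minimal sketch of the date-in-the-name approach follows. The class name DailyTables, the "Analytics" prefix and the retention arithmetic are assumptions made for illustration; the only service rule relied on is that table names must be alphanumeric and start with a letter, which is why the date is appended to a prefix rather than used on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class DailyTables
{
    // Builds a table name such as "Analytics20100622" for the given day.
    public static string NameFor(string prefix, DateTime day)
    {
        return prefix + day.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    // Lists the daily tables that have aged out of the retention window,
    // looking back over lookbackDays days beyond it.
    public static List<string> ExpiredTables(string prefix, DateTime today,
                                             int retentionDays, int lookbackDays)
    {
        var expired = new List<string>();
        for (int age = retentionDays + 1; age <= retentionDays + lookbackDays; age++)
        {
            expired.Add(NameFor(prefix, today.Date.AddDays(-age)));
        }
        return expired;
    }
}
```

A scheduled task could pass each returned name to whatever table-deletion call the application already uses, so that each day's data is cleared with a single operation.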
Figure 3: Using table names as part of data schema

SUMMARY

As you can see, the scope for creative schema design in Azure Table storage is massive. This is one of the best things about the NoSQL family of databases: many of the problems with which we have traditionally struggled when using rigidly-structured relational databases have much simpler, more direct solutions in a less-structured paradigm.

Whilst the official Microsoft Azure SDK is a great tool for modelling a lot of domains, and provides a very usable interface to the powerful features of the Azure storage stack with its familiar LINQ DataContexts and Query providers, I hope this short article has highlighted a few of the things you can achieve by digging deeper into the SDK, or ignoring it entirely and learning to use the REST API to fully exploit the NoSQL nature of Azure Table Storage.
QUERIES AND AZURE TABLES
By Neil Mackenzie

CREATEQUERY<T>()

There are several classes involved in querying Azure Tables using the Azure Storage Client library. However, a single method is central to the querying process and that is CreateQuery<T>() in the DataServiceContext class. CreateQuery<T>() is declared:

    public DataServiceQuery<T> CreateQuery<T>(String entitySetName);

This method is used implicitly or explicitly in every query against Azure Tables using the Storage Client library. The CreateQuery<T>() return type is DataServiceQuery<T>, which implements both the IQueryable<T> and IEnumerable<T> interfaces. LINQ supports the decoration of a query by operators filtering the results of the query. Although a full LINQ implementation has many decoration operators, only the following are implemented for the Storage Client library:

  • Where
  • Take
  • First
  • FirstOrDefault

These are implemented as extension methods on the DataServiceQuery<T> class. When a query is executed these decoration operators are translated into the $filter and $top operators used in the Azure Table Service REST API query string submitted to the Azure Table Service.

The following example demonstrates a trivial use of CreateQuery<T>() and the Take() operator to retrieve ten records from a Songs table:

    protected void SimpleQuery(CloudTableClient cloudTableClient)
    {
        TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext();
        tableServiceContext.ResolveType = (unused) => typeof(Song);
        IQueryable<Song> songs =
            (from entity in tableServiceContext.CreateQuery<Song>("Songs")
             select entity).Take(10);
        List<Song> songsList = songs.ToList<Song>();
    }

As with other LINQ implementations the query is not submitted to the Azure Table Service until the query results are enumerated. Note the use of ResolveType to work around a performance issue when the table name differs from the class name.
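The deferred-execution behaviour noted above is a general property of LINQ and can be observed without any storage account. The following sketch uses an in-memory iterator as a stand-in for the Songs table; the class name DeferredExecutionDemo and its counters are hypothetical and are not part of the Storage Client library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Number of times the underlying source has actually been read.
    public static int SourceReads;
    // Snapshot of SourceReads taken after the query is defined but before it is enumerated.
    public static int ReadsBeforeEnumeration;

    // Stand-in for a remote table: the body only runs when enumeration starts.
    private static IEnumerable<string> ReadSongs()
    {
        SourceReads++;
        yield return "Song A";
        yield return "Song B";
        yield return "Song C";
    }

    public static List<string> Run()
    {
        SourceReads = 0;
        // Defining the query does not touch the source.
        IEnumerable<string> query = ReadSongs().Take(2);
        ReadsBeforeEnumeration = SourceReads;
        // Enumerating it (here via ToList) is what triggers the read.
        return query.ToList();
    }
}
```

In the Table Service case the equivalent of the read is the HTTP request, which is why the position of ToList() or a foreach loop determines when the request is made.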
The MSDN Azure documentation has a page showing several examples of LINQ queries demonstrating filtering on properties with the various datatypes – String, numbers, Boolean and DateTime – so they will not be repeated here. Instead, this article focuses on the various methods provided to invoke queries.

CONTEXTS

The SimpleQuery example used the TableServiceContext.CreateQuery() method as follows:

    tableServiceContext.CreateQuery<Song>("Songs")

This syntax can be simplified by deriving a class from TableServiceContext as follows:

    public class SongContext : TableServiceContext
    {
        internal static String TableName = "Songs";

        public SongContext(String baseAddress, StorageCredentials credentials)
            : base(baseAddress, credentials)
        {
        }

        public IQueryable<Song> Songs
        {
            get { return this.CreateQuery<Song>(TableName); }
        }

        public void AddSong(Song song)
        {
            this.AddObject(TableName, song);
            this.SaveChanges();
        }
    }

This class is specific to the Song model class representing the entities in the Azure table named Songs. The Songs property can be used as the core of any LINQ query instead of the tableServiceContext.CreateQuery<Song>("Songs") used previously. Doing this simplifies and improves the readability of the LINQ query. For example, the LINQ query:

    from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity

can be rewritten as:

    from entity in songContext.Songs select entity

where songContext is a SongContext object.

QUERYING ON PARTITIONKEY AND ROWKEY

The primary key for an entity in an Azure table comprises PartitionKey and RowKey. The most performant query in the Azure Table Service is one specifying both PartitionKey and RowKey returning a single entity. When handling a query specifying PartitionKey and not RowKey the Azure
Table Service scans every entity in the partition while for a query specifying RowKey and not PartitionKey it must query each partition separately.

CONTINUATION

A query specifying both PartitionKey and RowKey is the only query guaranteed to return its entire result set in a single response. A further limit on query results is that no more than 1,000 results are ever returned in response to a single request – regardless of how many entities satisfy the query filter. The Azure Table Service inserts a continuation token in the response header to indicate there are additional results which can be retrieved through an additional request parameterized by the continuation token.

DATASERVICEQUERY

DataServiceQuery is the WCF Data Services class representing a query to the Azure Table Service. DataServiceQuery provides the following methods to send queries to the Azure Table Service:

    public IAsyncResult BeginExecute(AsyncCallback callback, Object state);
    public IEnumerable<TElement> EndExecute(IAsyncResult asyncResult);
    public IEnumerable<TElement> Execute();

Execute() is a synchronous method which sends the query to the Azure Table Service and blocks until the query returns. BeginExecute() and EndExecute() are a matched pair of methods used to implement the AsyncCallback Delegate model for asynchronously accessing the Azure Table Service.
The following is an example of Execute():

    protected void UsingDataServiceQueryExecute(CloudTableClient cloudTableClient)
    {
        TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext();
        tableServiceContext.ResolveType = (unused) => typeof(Song);
        DataServiceQuery<Song> dataServiceQuery =
            (from entity in tableServiceContext.CreateQuery<Song>("Songs")
             select entity).Take(10) as DataServiceQuery<Song>;
        IEnumerable<Song> songs = dataServiceQuery.Execute();
        foreach (Song song in songs)
        {
            String singer = song.Singer;
        }
    }

Note that the query must be explicitly cast from an IQueryable<Song> to a DataServiceQuery<Song>.

The asynchronous model is implemented by invoking BeginExecute() passing it the name of a static callback delegate and, optionally, an object providing some invocation context to the callback delegate. In practice, this object must include the DataServiceQuery object on which BeginExecute() was invoked. BeginExecute() initiates query submission and sets up an IO Completion Port to wait for the query to complete. When it completes, the callback delegate is invoked on a worker thread. EndExecute() must be invoked in the callback delegate to access the results. Furthermore, a failure to invoke EndExecute() could lead to resource leakage. EndExecute() returns an object of type
QueryOperationResponse<T> which implements an IEnumerable<T> interface. QueryOperationResponse<T> exposes information about the query request and response including the HTTP status of the response.

Note that the version of WCF Data Services currently used in Azure does not support server-side paging so that a DataServiceQuery is not able to process continuation tokens. The next version does but is not yet released in the Azure environment. Consequently, DataServiceQuery.Execute() may not retrieve all the entities requested if there are more than 1,000 of them – or, indeed, if there is a need for continuation tokens which can happen on any query not specifying both PartitionKey and RowKey.

CLOUDTABLEQUERY

The CloudTableQuery<T> class supports continuation tokens. A CloudTableQuery<T> object is created using one of the two constructors:

    public CloudTableQuery<TElement>(DataServiceQuery<TElement> query, RetryPolicy policy);
    public CloudTableQuery<TElement>(DataServiceQuery<TElement> query);

or the AsTableServiceQuery() extension method of the TableServiceExtensionMethods class:

    public static CloudTableQuery<TElement> AsTableServiceQuery<TElement>(
        IQueryable<TElement> query )

The CloudTableQuery<T> class has the following synchronous methods to handle query submission to the Azure Table Service:

    public IEnumerable<TElement> Execute(ResultContinuation continuationToken);
    public IEnumerable<TElement> Execute();

Execute() handles continuation automatically and continues to submit queries to the Azure Table Service until all the results have been returned. Execute(ResultContinuation) starts the request with a previously acquired ResultContinuation object encapsulating a continuation token and continues the query until all results have been retrieved. Note that care should be taken when using either form of Execute() since large amounts of data might be returned when the query is enumerated.
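To make the continuation mechanism concrete, the following sketch simulates it entirely in memory. It is not the Storage Client implementation: the Page and ContinuationDemo types are hypothetical, and a plain integer offset stands in for the real continuation token, which the service returns in response headers. What it does show is the client-side loop that an automatic Execute() has to perform.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one response: a page of results plus an optional resume token.
public class Page
{
    public List<int> Results = new List<int>();
    public int? ContinuationToken; // null when there is nothing more to fetch
}

public static class ContinuationDemo
{
    public const int MaxPerResponse = 1000;

    // Simulated service call: returns at most MaxPerResponse items starting at the token.
    public static Page Query(List<int> table, int? token)
    {
        int start = token ?? 0;
        var page = new Page();
        page.Results = table.Skip(start).Take(MaxPerResponse).ToList();
        int next = start + page.Results.Count;
        page.ContinuationToken = next < table.Count ? (int?)next : null;
        return page;
    }

    // Client-side loop: re-issues the query until no token comes back.
    public static List<int> FetchAll(List<int> table, out int requests)
    {
        var all = new List<int>();
        int? token = null;
        requests = 0;
        do
        {
            Page page = Query(table, token);
            requests++;
            all.AddRange(page.Results);
            token = page.ContinuationToken;
        } while (token != null);
        return all;
    }
}
```

For a simulated table of 2,500 entities the loop issues three requests, which illustrates why a query that ignores continuation tokens can silently return a partial result.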
The following example shows Execute() retrieving all the records from a table:

    protected void UsingCloudTableQueryExecute(CloudTableClient cloudTableClient)
    {
        TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext();
        tableServiceContext.ResolveType = (unused) => typeof(Song);
        CloudTableQuery<Song> cloudTableQuery =
            (from entity in tableServiceContext.CreateQuery<Song>("Songs")
             select entity).AsTableServiceQuery<Song>();
        IEnumerable<Song> songs = cloudTableQuery.Execute();
        foreach (Song song in songs)
        {
            String singer = song.Singer;
        }
    }

The CloudTableQuery<T> class has an equivalent set of asynchronous methods declared:

    public IAsyncResult BeginExecuteSegmented(ResultContinuation continuationToken,
        AsyncCallback callback, Object state);
    public IAsyncResult BeginExecuteSegmented(AsyncCallback callback, Object state);
    public ResultSegment<TElement> EndExecuteSegmented(IAsyncResult asyncResult);

These follow the method-naming style used elsewhere in the Storage Client library whereby the suffix Segmented indicates that the methods bring data back in batches – in this case from one continuation token to the next. This provides a convenient method of paging through results in batches of size specified by the Take() query decoration operator or the 1,000 records that is the maximum number of records retrievable in a single request. As with the synchronous Execute() methods the difference between the two BeginExecuteSegmented() methods is that one starts the retrieval at the beginning of the query result set while the other starts at the entity indicated by the continuation token in the ResultContinuation parameter.
The following is an example of BeginExecuteSegmented() and EndExecuteSegmented() paging through the result set of a query in pages of 10 entities at a time:

    protected void QuerySongsExecuteSegmentedAsync(CloudTableClient cloudTableClient)
    {
        TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext();
        tableServiceContext.ResolveType = (unused) => typeof(Song);
        CloudTableQuery<Song> cloudTableQuery =
            (from entity in tableServiceContext.CreateQuery<Song>("Songs").Take(10)
             select entity).AsTableServiceQuery<Song>();
        IAsyncResult iAsyncResult = cloudTableQuery.BeginExecuteSegmented(
            BeginExecuteSegmentedIsDone, cloudTableQuery);
    }
    static void BeginExecuteSegmentedIsDone(IAsyncResult result)
    {
        CloudTableQuery<Song> cloudTableQuery = result.AsyncState as CloudTableQuery<Song>;
        ResultSegment<Song> resultSegment = cloudTableQuery.EndExecuteSegmented(result);
        List<Song> listSongs = resultSegment.Results.ToList<Song>();
        if (resultSegment.HasMoreResults)
        {
            IAsyncResult iAsyncResult = cloudTableQuery.BeginExecuteSegmented(
                resultSegment.ContinuationToken, BeginExecuteSegmentedIsDone, cloudTableQuery);
        }
    }

It is also possible to iterate through subsequent results using the GetNext() method of the ResultSegment<T> class rather than using BeginExecuteSegmented() with a ResultContinuation parameter.

It is worth noting the difference made by replacing the cloudTableQuery in the above example with:

    CloudTableQuery<Song> cloudTableQuery =
        (from entity in tableServiceContext.CreateQuery<Song>("Songs")
         select entity).Take(10).AsTableServiceQuery<Song>();

Here, the Take(10) is outside the LINQ query definition. This query results in the retrieval of only 10 records and does not page through the table in pages of 10 entities as in the previous example.

Note that exception handling is even more important in callback delegates than it is in normal code because they are not invoked from user code and errors cannot be caught outside the method. Consequently, all errors must be caught and handled inside the callback delegate.
TRICKS FOR STORING TIME AND DATE FIELDS IN TABLE STORAGE
By Saksham Gautam

Windows Azure Table Storage supports storing enormous amounts of data in massively scalable tables in the cloud. The tables can store terabytes upon terabytes of data and billions of entities. In order to attain this amount of scalability, Windows Azure Table storage employs a scale-out model to distribute entities across multiple storage nodes. Each application has to decide on the partition scheme by choosing the partition-keys for the entities. Moreover, each entity within a partition is uniquely identified by its row-key. In this section, we discuss how to use the two types of entity keys in order to simulate a descending order based on timestamps so that queries based on dates are more efficient.

Entity keys, PartitionKey and RowKey, are strings of up to 1KB in size. As they are strings, all comparisons are purely lexicographic, i.e. “100” < “20” < “9”. At first glance making a key from time might seem very straightforward. “Just use the ‘yyyyMMddHHmmssfffff’ pattern for the DateTime”, you might say. Using a fixed length for the different components of the time would indeed ensure that lexical comparisons are equivalent to DateTime comparisons. However, the entities would be arranged in ascending order within the table. As many real life applications are interested in fetching the most recent entities first, the queries are inadvertently less efficient using this simple method. Let’s examine this more closely using an example.

Let us assume that we are making a location based service application that lets mobile users send periodic position reports to a Windows Azure Worker Role, which in turn logs the reports to the table storage. A ‘PositionReport’ entity could look something like that shown in Table 1.
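The two claims above, that unpadded numbers compare "wrongly" as strings and that a fixed-width timestamp pattern compares correctly, can be checked with a few lines of standard .NET. The class name KeyOrdering is hypothetical, and ordinal comparison is used here as an approximation of the lexicographic ordering the article describes.

```csharp
using System;
using System.Globalization;

public static class KeyOrdering
{
    // True when a sorts strictly before b under ordinal (character-by-character) comparison.
    public static bool SortsBefore(string a, string b)
    {
        return String.CompareOrdinal(a, b) < 0;
    }

    // Fixed-width timestamp key; every component is zero padded by the format string.
    public static string AscendingKey(DateTime time)
    {
        return time.ToString("yyyyMMddHHmmssfffff", CultureInfo.InvariantCulture);
    }
}
```

For example, KeyOrdering.SortsBefore("100", "20") is true even though 100 is the larger number, whereas keys produced by AscendingKey() always sort in the same order as the times they were built from.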
    Property       DataType
    PartitionKey   String
    RowKey         String
    DeviceId       String
    ReportedOn     DateTime
    Latitude       Double
    Longitude      Double

    Table 1 Properties of a PositionReport entity

Also, suppose that the majority of queries would be something like, “Get the 100 most recent position reports of device X” so that they could be displayed on a map. If we used the ascending order model, our query would first have to fetch all the entities from the table (or partition), then get the last 100 entities. There is a way to fetch just the 100 entities that you need. The clue here lies in the ReportedOn property. Let’s convert the time the device reported into a ‘reverse timestamp’ by simply doing the following:

    (DateTime.MaxValue.Ticks - reportedOn.Ticks).ToString().PadLeft(19)
By reversing the number of ‘ticks’ in the time and then making it of fixed length, we create a mechanism for assigning newer entities with keys that are lexically less than those of older entities. We could use this as the RowKey for our entity. Then, we would not need to have an additional property for storing the time the device sent the position report because we could easily compute it using the RowKey as shown below. As a result, we save some bandwidth as well.

    new DateTime(DateTime.MaxValue.Ticks - Int64.Parse(RowKey))

The prime candidate for the partition-key would be the ID of the device, so that all entities for a single device go into one partition. However, if a device sends many position reports over time, our partition might grow enormous. Choosing a partition key is an opportunity to load balance the entities across different servers. Hence, we can definitely do better than choosing a fixed partition key. We could use a similar technique to the one we used for our RowKey. Without loss of generality let us assume that we could keep all position reports for a device within a month in one partition. With that, constructing the reverse timestamp for our partition key is easy. We could do the following.

    DateTime temp = new DateTime(reportedOn.Year, reportedOn.Month, 1);

We then create an identifier by concatenating the ID of the device with the reversed timestamp based on the time the device reported. But first we have to decide whether the device ID is significant in queries that we want to perform, or whether the ReportedOn property is more significant. In other words, are most of your queries something like “Give me position reports for device X” or are they more like “Give me devices that have reported within a certain interval”. Based on that, we determine whether our partition key would have the timestamp or the device ID as its prefix.
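The row key round trip described above can be sketched as a pair of helper methods. The class name ReverseTimestamp is hypothetical, and this version pads with zeros using the "D19" format rather than the PadLeft(19) call shown in the text; that is a choice made here so that every key is a plain 19-digit number, not something the article prescribes.

```csharp
using System;
using System.Globalization;

public static class ReverseTimestamp
{
    // Converts a time into a fixed-width key where newer times give lexically smaller keys.
    public static string ToRowKey(DateTime reportedOn)
    {
        long reversed = DateTime.MaxValue.Ticks - reportedOn.Ticks;
        return reversed.ToString("D19", CultureInfo.InvariantCulture);
    }

    // Recovers the original time from a key produced by ToRowKey.
    public static DateTime FromRowKey(string rowKey)
    {
        long reversed = Int64.Parse(rowKey, CultureInfo.InvariantCulture);
        return new DateTime(DateTime.MaxValue.Ticks - reversed);
    }
}
```

Because the key of a newer report sorts before the key of an older one, a query that simply takes the first 100 entities of a partition returns the 100 most recent reports.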
Let us assume that we decided to use the device ID as prefix. Once we have our partition key, we could easily recalculate the device ID from it, because the reversed timestamp always occupies the last 19 characters of the key.

    String deviceId = PartitionKey.Substring(0, PartitionKey.Length - 19);

Note that creating entity keys by concatenation in this way only keeps partitions for different devices correctly ordered and separated if the device ID is of fixed length.

The PositionReport entity now looks like the one shown in Table 2.
    Property         DataType
    PartitionKey     String
    RowKey           String
    Latitude         Double
    Longitude        Double
    GetReportedOn()  Returns DateTime
    GetDeviceId()    Returns String

    Table 2 Modified PositionReport entity

As for the queries based on time, we can construct them such that we include at least one of the entity keys and preferably (always) the partition key, as illustrated in the following examples.

1. 100 most recent entities for the device within this month
   a. Compute the combined partition key based on the device ID and the reversed timestamp using the first day of the current month.
   b. Query the table using the greater than (>) operator on the row key and the equal to (=) operator on the partition key computed in 1.a.
2. 100 most recent entities for the device
   a. Note that all partition keys for entities belonging to a particular device are created by appending a suffix to the device ID. Hence, query the table storage using the greater than (>) operator on the partition key.
   b. Since a device may not have 100 position reports, the entities returned by the query may contain entities corresponding to other devices. You should remove them in the data access layer before you use the result in the application.
   c. Note that we did not use both the (>) and (<) operators to filter out results at the table storage itself, but instead chose to filter the results in our code. This is because, as of the time of this writing, if a range query is based on partition keys, i.e. it contains the ‘AND’ or ‘OR’ keyword, it results in a full table scan.
3. 100 most recent entities for the device within a specific period
   a. Construct the combined partition keys for the dates that define the interval as in 1.a.
   b. Construct the row keys by using reverse timestamps for the dates.
   c. As mentioned in 2.c, it is not efficient to use all the keys in a single query. Hence, create two queries, each using one partition key and one row key.
   d. Perform the two queries and combine (union) the entities before using them in the application.
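The partition key used in steps 1.a and 3.a can be sketched as follows, assuming the device ID is used as the prefix and has a fixed length. The class name PositionReportKeys, the zero-padded "D19" formatting and the sample device ID are assumptions made for illustration.

```csharp
using System;
using System.Globalization;

public static class PositionReportKeys
{
    // Width of the reversed-ticks portion of the key.
    public const int ReverseTicksWidth = 19;

    // Fixed-width reversed ticks: newer times give lexically smaller strings.
    public static string ReverseTicks(DateTime time)
    {
        return (DateTime.MaxValue.Ticks - time.Ticks)
            .ToString("D19", CultureInfo.InvariantCulture);
    }

    // Partition key: device ID followed by the reversed timestamp of the first day of the month.
    public static string PartitionKeyFor(string deviceId, DateTime reportedOn)
    {
        DateTime monthStart = new DateTime(reportedOn.Year, reportedOn.Month, 1);
        return deviceId + ReverseTicks(monthStart);
    }

    // Recovers the device ID by stripping the fixed-width timestamp suffix.
    public static string DeviceIdFrom(string partitionKey)
    {
        return partitionKey.Substring(0, partitionKey.Length - ReverseTicksWidth);
    }
}
```

All reports from one device in the same month share a partition key, and the partition for a later month sorts before the partition for an earlier one, which is what makes the "greater than" scans in the list above return recent data first.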
Using reverse ticks for entity keys should be sufficient in most cases. However, there might be scenarios when a lot of data is generated, and when you compute the entity keys using the method described above more than one entity might try to use the same keys. Take an example of an application in which there are multiple processes that add ‘Event’ entities to the ‘Events’ table. An ‘Event’ entity could look something like that shown in Table 3.

    Property          DataType
    PartitionKey      String
    RowKey            String
    EventType         Int
    Description       String
    GetEventSource()  Returns String
    GetEventTime()    Returns DateTime

    Table 3 Structure of Event Entity

The partition key is based on the event source, which could be the name of the process that generated the event, and the row key is based on the event time. If there are multiple processes with the same name that write into the table at the exact same time, we would run into problems because the combination of partition key and row key has to be unique. The solution to avoid such entity key collisions is to append a globally unique identifier (GUID) to the end of the row key. Hence, the row keys would be computed like so.

    String revTicks = (DateTime.MaxValue.Ticks - eventTime.Ticks).ToString().PadLeft(19);
    RowKey = revTicks + Guid.NewGuid().ToString();

Care has to be taken while querying. Since the row keys don’t correspond directly to ticks anymore, it is not correct to use the <= and >= operators in the queries when used with row keys. To get all events that occurred at time = T, one has to convert T and the adjacent tick into row keys and use them with the > and < operators in the where condition of the query. Because the ticks are reversed, the key for T is the lower bound and the key for (T - 1 tick) is the upper bound.

One might ask, can this technique be used on a Table that is already using only reverse timestamps as row keys. The short answer is yes! As we discussed earlier, comparisons on entity keys are purely lexicographical.
If there are three strings A, B and C of equal length (as fixed-width reverse timestamps are), and if lexically A < B < C, adding any suffix to either one or all of them does not affect how they are ordered afterwards.
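The range query for "all events at time T" can be checked in memory. In this sketch the class name EventKeys and the zero-padded "D19" formatting are assumptions; the bounds follow from the reversed-ticks scheme, where the bare key for T is a prefix of every GUID-suffixed key at T and the key for one tick earlier is the next larger 19-digit value.

```csharp
using System;
using System.Globalization;

public static class EventKeys
{
    // Fixed-width reversed ticks: newer times give lexically smaller strings.
    public static string ReverseTicks(DateTime time)
    {
        return (DateTime.MaxValue.Ticks - time.Ticks)
            .ToString("D19", CultureInfo.InvariantCulture);
    }

    // Row key made unique by appending a GUID to the reversed ticks.
    public static string NewRowKey(DateTime eventTime)
    {
        return ReverseTicks(eventTime) + Guid.NewGuid().ToString();
    }

    // Exclusive lower bound: the bare reversed ticks of T.
    public static string LowerBound(DateTime t)
    {
        return ReverseTicks(t);
    }

    // Exclusive upper bound: the reversed ticks of one tick earlier than T.
    public static string UpperBound(DateTime t)
    {
        return ReverseTicks(t.AddTicks(-1));
    }

    // Mirrors a filter of the form: RowKey > lower AND RowKey < upper.
    public static bool InRange(string rowKey, DateTime t)
    {
        return String.CompareOrdinal(rowKey, LowerBound(t)) > 0
            && String.CompareOrdinal(rowKey, UpperBound(t)) < 0;
    }
}
```

Keys generated for T fall inside the range, while keys generated one tick before or after T fall outside it, regardless of the GUID that was appended.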
USING WORKER ROLES TO IMPLEMENT A DISTRIBUTED CACHE
By Josh Tucholski

One of the most sought after goals of an aspiring application, viral growth, is also one of the quickest routes to failure if the application receives it unexpectedly. Windows Azure addresses the problem of viral growth by providing a scalable infrastructure that can quickly allocate additional instances of a service on an as-needed basis. However, as traffic and use of an application grow, it is inevitable that its database will suffer without any type of caching layer in place. In smaller environments, it is sufficient to use the built-in cache that a server provides for efficient data retrieval. This is not the case in Windows Azure. Windows Azure provides a transparent load balancer, thereby making the placement of data into specific server caches impractical unless one can guarantee each user continually communicates with the same web server. Armed with a distributed cache and a well-built data access tier, one can address this issue to ensure that all clients that issue similar data requests only go to the database once and use the cached version going forward (pending updates).

CONFIGURING THE CACHE

One of the most popular distributed caching implementations is memcached, used by YouTube, Facebook, Twitter, and Wikipedia. Memcached can run from the command line as an executable, making it a great use case for a Windows Azure worker role. When memcached is active all of its data is stored in memory, which makes increasing the size of the cache as easy as increasing the worker instance count. The following code snippet demonstrates how a worker role initializes the memcached process and defines required parameters identifying its unique instance IP address and the maximum size of the cache in MB.

Note: You will need to include the memcached executable to start the process within the Windows Azure app fabric.
//Retrieve the Endpoint information for the current role instance IPEndPoint endpoint = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["EndpointName"].IPEndpoint; string cacheSize = RoleEnvironment.GetConfigurationSettingValue(CacheSizeKey); //memcached arguments //m = size of the cache in MB //l = IP address of the cache server //p = port address of the cache server string arguments = "-m " + cacheSize + " -l " + endpoint.Address + " -p " + endpoint.Port; ProcessStartInfo startInfo = new ProcessStartInfo() { CreateNoWindow = true, UseShellExecute = false, FileName = "memcached.exe", Arguments = arguments }; //The worker role’s only purpose is to execute the memcached process and run until shutdown using (Process exeProcess = Process.Start(startInfo)) { exeProcess.WaitForExit(); } 68
  • 69. The Windows Azure Platform: Articles from the Trenches
USING THE DISTRIBUTED CACHE
Once the distributed cache instance is active, a client library such as Enyim is used to access the contents of the cache. The most challenging part of integrating Enyim with the distributed cache is identifying all of the cache endpoints available for access. Fortunately, when using the Windows Azure API, internal endpoints are discoverable by worker role name. Hooks can be added to detect when the client is out of sync with the actual number of worker role instances and trigger an automatic refresh. The Web Role found in the Windows Azure Memcached Solution Accelerator has a well-written implementation of the Enyim client that demonstrates how to detect its configuration. The following code snippet shows how simple it is to retrieve and store data in the cache once the configuration interface is implemented:

    //See the Windows Azure Memcached Solution Accelerator for instructions on implementing
    //the AzureMemcachedClientConfiguration class
    private static AzureMemcachedClientConfiguration _configuration;

    //MemcachedClient is provided through Enyim
    private static MemcachedClient _client;

    private static MemcachedClient Client
    {
        get
        {
            EnsureClientUpToDate();
            return _client;
        }
    }

    private static void EnsureClientUpToDate()
    {
        //If a configuration exists, confirm that the endpoints it is
        //aware of match the ones in Windows Azure
        if (_client == null || _configuration == null || _configuration.IsOutOfDate)
        {
            _configuration = new AzureMemcachedClientConfiguration();
            _client = new MemcachedClient(_configuration);
        }
    }

    public object Get(string key)
    {
        //The client serves four key purposes: retrieving, storing, removing, and flushing the cache
        object val = Client.Get(key);
        return val;
    }

    public void Put(string key, object value)
    {
        //Stores the key/value pair in the distributed cache using the client
        //Available StoreMode operations:
        //Add - adds the item to the cache only if it does not exist
        //Replace - replaces an item in the cache only if it does exist
        //Set - adds the item if it does not exist or replaces it if it does
        if (!Client.Store(StoreMode.Set, key, value, DateTime.UtcNow.AddSeconds((double)_expiry)))
        {
            Console.WriteLine("MemcachedCache - could not save key " + key);
        }
    }

Once the implementation of the client is in place, any part of the application that has access to the client can integrate with the distributed cache. Certain object-relational mapping tools, such as NHibernate, even contain support for cache providers. From this point, it is simple to construct a new cache provider and integrate it with the Windows Azure distributed cache.

  • 70. The Windows Azure Platform: Articles from the Trenches
I recommend hashing your object keys if any other library integrates with your distributed cache, to avoid name collisions. Implementing a distributed cache has proven beneficial in most scenarios, as long as the application controls the data flowing in and out. If external resources modify the data by communicating with the database directly, you need to rethink the architecture of your distributed application, or at least invalidate your cache more frequently so that the data within it does not go stale. With Windows Azure, the cache service as a whole stays available while instances are added or recovered after a failure. Because memcached holds its data only in memory, however, the contents of a recycled instance are lost and must be repopulated from the database.
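The recommendation to hash object keys can be sketched as follows. This is an illustration rather than part of the Solution Accelerator: the `CacheKeys` class, the namespace prefixes and the choice of SHA-256 are all assumptions. The idea is that each library prefixes its keys with its own namespace before hashing, which also keeps keys within memcached's limits (at most 250 bytes, no whitespace or control characters).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class CacheKeys
{
    // Builds a fixed-length, memcached-safe key from a logical namespace and a raw key.
    public static string Hash(string keyNamespace, string rawKey)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(keyNamespace + ":" + rawKey));
            // 32 bytes rendered as 64 lowercase hex characters
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }

    static void Main()
    {
        // Hypothetical namespaces: one for application code, one for an ORM cache provider
        string a = Hash("orders", "customer 42 / open orders");
        string b = Hash("nhibernate", "customer 42 / open orders");

        Console.WriteLine(a.Length);  // 64
        Console.WriteLine(a == b);    // False: same raw key, different namespace
    }
}
```

The hashed value would then be passed as the `key` argument of the `Get` and `Put` methods shown earlier. The trade-off is that hashed keys are no longer human-readable when inspecting the cache.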
  • 71. The Windows Azure Platform: Articles from the Trenches
LOGGING, DIAGNOSTICS AND HEALTH MONITORING OF WINDOWS AZURE APPLICATIONS
By David Gristwood

Monitoring the health of an application is key to keeping it up and running and to resolving problems as and when they arise. Most developers know how to do this to some degree with on-premises applications, as part of the general maintenance of the application and server. However, an Azure application running in the cloud is very different from a traditional application when it comes to monitoring and diagnostics, for several reasons. Firstly, the application will typically run across a whole set of machines managed by the Windows Azure fabric, and that set is dynamic and changes over time, so the problem is much more complex than that of a single machine. Secondly, there is no direct system admin access to the machines running in Windows Azure in the way you would have if you owned and managed the machines yourself. The Windows Azure fabric handles much of the complexity of deploying and managing roles and machines, and it does not provide low-level access to resources. And finally, you can't just attach a debugger to the cloud and step through your code.

Fortunately, Windows Azure has a diagnostic capability that allows you to monitor the health of your application across the different roles that make up your Azure application. It is not about creating new APIs. Rather, it is about using the existing logging and tracing capabilities in the Windows platform that many developers are already familiar with, and building a monitoring strategy on them that covers scenarios such as debugging, troubleshooting, performance, resource usage monitoring, traffic analysis, capacity planning, and auditing.

There are three key stages to using Windows Azure diagnostics. Firstly, deciding what diagnostic data you wish to collect. Secondly, deciding when and what diagnostic data should be persisted to Windows Azure storage for analysis. And finally, downloading the data from Windows Azure for analysis.

COLLECTING DIAGNOSTIC DATA
One of the most common pieces of diagnostic data to collect is the Windows Azure logs, which receive any System.Diagnostics.Trace messages embedded in an application. These Trace messages are the main way to log the flow and status of an application and are built on the existing Event Tracing for Windows (ETW) capabilities. By default this trace data, the IIS 7.0 logs and the Windows Diagnostic infrastructure logs are all collected when you switch on diagnostics. The Windows Diagnostic infrastructure logs provide general-purpose problem detection, troubleshooting, and resolution for Windows components.

Diagnostics are initialized within a role's OnStart() method:

    public override bool OnStart()
    {
        // Get default initial configuration
        var config = DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Add any other data sources that need to be tracked here

        DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
        return base.OnStart();
    }

  • 72. The Windows Azure Platform: Articles from the Trenches
Additional data sources can be added to the configuration before DiagnosticMonitor.Start() is called. For diagnostics, the crash dumps and Windows Event logs can prove invaluable. For fine-tuning and capacity planning, the performance counters (which include CPU, memory, paging, etc.) are essential.

PERSISTING DIAGNOSTIC DATA
All the diagnostic data collected is stored in the local file store of the virtual machine within the Windows Azure fabric. The local file store will not survive machine recycles or rebuilds, so the diagnostic data needs to be transferred to a persistent store, such as Windows Azure storage. These transfers can take place either as regular scheduled events, perhaps every 10 minutes, or on demand. Setting up a scheduled transfer is as easy as setting the ScheduledTransferPeriod property on the appropriate data source before the call to DiagnosticMonitor.Start():

    // schedule transfer of basic logs to Azure storage
    config.Logs.ScheduledTransferPeriod = System.TimeSpan.FromMinutes(1.0);
    DiagnosticMonitor.Start("DiagnosticsConnectionString", config);

An on-demand transfer can be initiated from outside a Windows Azure application, which makes it possible to control persisting data from a dashboard or system support application.

ANALYSING THE DIAGNOSTIC DATA
The default behaviour of a transfer of diagnostic data is to persist the data to a set of "wad" (Windows Azure Diagnostics) prefixed Windows Azure Tables and Blob containers (crash dumps go into Blob storage, Windows Azure logs into Tables, etc.). These can then be inspected online, with tools such as Cerebrata's Azure Diagnostic Manager (see screenshot), or downloaded using the REST-based API for local viewing and analysis.
  • 73. The Windows Azure Platform: Articles from the Trenches
For analysis, tuning, or resolving more complex issues, storing log and trace information in SQL Server makes it easier to filter the relevant information and detect process flow, exceptions, etc. As with all debugging and monitoring scenarios, the key is to embed good-quality information within applications, especially information that helps track flow across multiple machines and roles.

MORE INFORMATION
Matthew Kerner's excellent PDC09 session, the demos from that talk, and the MSDN documentation cover this topic in more depth.
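One way to embed information that tracks flow across machines and roles is to stamp every trace message with the instance that wrote it and an id that follows the request. The sketch below uses only System.Diagnostics and runs on a desktop. The `Format` helper, the instance id string and the per-request Guid are illustrative assumptions. In a deployed role the instance id would typically come from RoleEnvironment.CurrentRoleInstance.Id, and the diagnostics monitor would collect the messages instead of the StringWriter used here.

```csharp
#define TRACE
using System;
using System.Diagnostics;
using System.IO;

static class TraceContextDemo
{
    // Prefixes a message with the role instance and a correlation id so that entries
    // written on different machines can be stitched back together later.
    public static string Format(string instanceId, Guid correlationId, string message)
    {
        return string.Format("[{0}] [{1:N}] {2}", instanceId, correlationId, message);
    }

    static void Main()
    {
        // Stand-in sink for this sketch
        var sink = new StringWriter();
        Trace.Listeners.Add(new TextWriterTraceListener(sink));

        Guid requestId = Guid.NewGuid();      // hypothetical per-request id
        string instanceId = "WebRole_IN_0";   // hypothetical instance id

        Trace.TraceInformation(Format(instanceId, requestId, "Order received"));
        Trace.TraceWarning(Format(instanceId, requestId, "Payment service slow, retrying"));
        Trace.TraceError(Format(instanceId, requestId, "Payment failed"));
        Trace.Flush();

        Console.Write(sink.ToString());
    }
}
```

With the correlation id stored alongside each entry, filtering a table of log rows down to a single request becomes a simple equality query.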
  • 74. The Windows Azure Platform: Articles from the Trenches
SERVICE RUNTIME IN WINDOWS AZURE
By Neil Mackenzie

ROLES AND INSTANCES
Windows Azure implements a Platform as a Service model through the concept of roles. There are two types of role: a web role deployed with IIS, and a worker role similar to a Windows service. Azure implements horizontal scaling of a service through the deployment of multiple instances of roles. Each instance of a role is allocated exclusive use of a VM selected from one of several sizes, from a small instance with 1 core to an extra-large instance with 8 cores. Memory and local disk space also increase with instance size.

All inbound network traffic to a role passes through a stateless load balancer, which uses an unspecified algorithm to distribute inbound calls among the instances of the role. Individual instances do not have public IP addresses and are not directly addressable from the Internet. Instances can connect directly to other instances in the service using TCP and HTTP.

Azure provides two deployment slots: staging for testing in a live environment, and production for the production service. A role with a public endpoint has a permanent URL in the production slot and a temporary URL in the staging slot. Otherwise, there is no real difference between the two slots.

ENDPOINTS
An Azure role has two types of endpoint: a public-facing input endpoint, and a private internal endpoint for communication among instances. Input endpoints and internal endpoints are associated with an Azure role through specification in the Service Definition file.

A web role may have only one HTTP input endpoint and one HTTPS input endpoint. A worker role may have an unlimited number of HTTP, HTTPS and TCP input endpoints as long as each is associated with a different port number. External services make connection requests to the virtual IP address for the role and the input endpoint port specified for the role in the Service Definition file. These connection requests are load balanced and forwarded to an Azure-allocated port on one of the instances of the role.

A web role may have only one HTTP internal endpoint. A worker role may have an unlimited number of HTTP and TCP internal endpoints, the only limitation being that each internal endpoint must have a unique name.

SERVICE UPGRADES
There are two ways to upgrade an Azure service: in-place upgrade and Virtual IP (VIP) swap. An in-place upgrade replaces the contents of a deployment slot with a new Azure application package and configuration file. A VIP swap simply swaps the virtual IP addresses associated with the production and staging slots.

  • 75. The Windows Azure Platform: Articles from the Trenches
Note that it is not possible to do an in-place upgrade where the new application package has a modified Service Definition file. Instead, any existing service in one of the slots must be deleted before the new version is uploaded. A VIP swap does support modifications to the Service Definition file.

The Azure SLA comes into force only when a service uses at least two instances per role. Azure uses upgrade domains and fault domains to facilitate adherence to the SLA.

The Azure fabric deploys instances over several upgrade domains. It implements an in-place upgrade of a role by bringing down all the instances in a single upgrade domain, upgrading them, and then restarting them before moving on to the next upgrade domain. The number of upgrade domains is configurable through the upgradeDomainCount attribute (default 5) of the ServiceDefinition root element in the Service Definition file. The Azure fabric completely controls the allocation of instances to upgrade domains, though an Azure service can view the upgrade domain for each of its instances through the RoleInstance.UpdateDomain property.

When Azure instances are deployed, the Azure fabric spreads them among different fault domains, which means they are deployed so that a single hardware failure does not bring down all the instances. The Azure fabric completely controls the allocation of instances to fault domains, though an Azure service can view the fault domain for each of its instances through the RoleInstance.FaultDomain property.

SERVICE DEFINITION AND SERVICE CONFIGURATION
An Azure service is defined and configured through its Service Definition and Service Configuration files.

The Service Definition file specifies the roles contained in the service along with the following for each role:
- upgradeDomainCount - number of upgrade domains for the service
- vmsize - the instance size from Small through ExtraLarge
- ConfigurationSettings - defines the settings used to configure the service
- LocalStorage - specifies the name and amount of disk space on the local VM
- InputEndpoints - defines the external endpoints for a role
- InternalEndpoint - defines the internal endpoints for a role
- Certificates - specifies the name and location of the X.509 certificate store

The Service Configuration file provides the configured values for:
- osVersion - specifies the Azure guest OS version for the deployed service
- Instances - specifies the number of instances of a role
- ConfigurationSettings - specifies the role-specific configuration parameters
- Certificates - specifies X.509 certificates for the role

The Service Configuration file comprises one of the two distinct parts of the service application package and consequently can be modified through an in-place upgrade. It can also be modified directly on the Azure portal.

ROLEENTRYPOINT
  • 76. The Windows Azure Platform: Articles from the Trenches
RoleEntryPoint is the base class providing the Azure fabric an entry point to a role. All worker roles must contain a class derived from RoleEntryPoint, but web roles can use ASP.NET lifecycle management instead. The standard Visual Studio worker role template provides a starter implementation of the necessary derived class.

RoleEntryPoint is declared:

    public abstract class RoleEntryPoint
    {
        protected RoleEntryPoint();

        public virtual Boolean OnStart();
        public virtual void OnStop();
        public virtual void Run();
    }

The Azure fabric initializes the role by invoking the overridden OnStart() method. Prior to this call the status of the role is Busy. Note that a web role can put initialization code in Application_Start instead of OnStart().

The overridden Run() is invoked following successful completion of OnStart() and provides the primary working thread for the role. An instance recycles automatically when Run() exits, so care should be taken, through use of Thread.Sleep() for example, that the Run() method does not terminate.

Azure invokes the overridden OnStop() during a normal suspension of the role. The Azure fabric stops the role automatically if OnStop() does not return within 30 seconds. Note that a web role can put shutdown code in Application_End instead of OnStop().

ROLE
The Role class represents a role in an Azure service. It exposes the Name of the role and a collection of deployed Instances for it.

ROLEENVIRONMENT
The RoleEnvironment class provides functionality allowing an instance to interact with the Azure fabric, as well as functionality providing access to the Service Configuration file and limited access to the Service Definition file.
  • 77. The Windows Azure Platform: Articles from the Trenches
RoleEnvironment is declared:

    public sealed class RoleEnvironment
    {
        public static event EventHandler<RoleEnvironmentChangedEventArgs> Changed;
        public static event EventHandler<RoleEnvironmentChangingEventArgs> Changing;
        public static event EventHandler<RoleInstanceStatusCheckEventArgs> StatusCheck;
        public static event EventHandler<RoleEnvironmentStoppingEventArgs> Stopping;

        public static RoleInstance CurrentRoleInstance { get; }
        public static String DeploymentId { get; }
        public static Boolean IsAvailable { get; }
        public static IDictionary<String,Role> Roles { get; }

        public static String GetConfigurationSettingValue(String configurationSettingName);
        public static LocalResource GetLocalResource(String localResourceName);
        public static void RequestRecycle();
    }

The IsAvailable property specifies whether or not the Azure environment is available. DeploymentId identifies the current deployment, Roles specifies the roles contained in the current service, and CurrentRoleInstance is a RoleInstance object representing the current instance. Note that Roles reports all Instances as being of zero size except the current instance and any instance with an internal endpoint.

GetConfigurationSettingValue() retrieves a configuration setting for the current role from the Service Configuration file. GetLocalResource() returns a LocalResource object specifying the root path for any local storage for the current role defined in the Service Definition file. RequestRecycle() initiates a recycle, i.e., stop and start, of the current instance.

The RoleEnvironment class also provides four events to which a role can register a callback method to be notified about various changes to the Azure environment. A role typically registers callback methods with these events in its OnStart() method.

The StatusCheck event is raised every 15 seconds. An instance can use the SetBusy() method of the RoleInstanceStatusCheckEventArgs class to indicate it is busy and should be taken out of the load-balancer rotation.

The Stopping event is raised when an instance is undergoing a controlled shutdown, although there is no guarantee it will be raised when an instance is shutting down due to an unhandled error. Note that the Stopping event is raised before the overridden OnStop() method is invoked.

The Changing event is raised before and the Changed event after a configuration change is applied to the role. The callback method for the Changing event has access to the old value of the configuration setting and can be used to control whether or not the instance should be restarted in response to the configuration change. The callback method for the Changed event has access to the new value of the configuration setting and can be used to reconfigure the instance in response to the change. The Changing and Changed callback methods are also used to handle topology changes to the service in which the number of instances of a role is changed.

ROLEINSTANCE
  • 78. The Windows Azure Platform: Articles from the Trenches
The RoleInstance class represents an instance of a role. It is declared:

    public abstract class RoleInstance
    {
        public abstract Int32 FaultDomain { get; }
        public abstract String Id { get; }
        public abstract IDictionary<String,RoleInstanceEndpoint> InstanceEndpoints { get; }
        public abstract Role Role { get; }
        public abstract Int32 UpdateDomain { get; }
    }

FaultDomain and UpdateDomain specify respectively the fault domain and upgrade domain for the instance. Role identifies the role and Id uniquely identifies the instance of the role. InstanceEndpoints is an IDictionary<> linking the name of each instance endpoint specified in the Service Definition file with the actual definition of the RoleInstanceEndpoint. Note that each instance of a role has its own distinct RoleInstanceEndpoint for each instance endpoint defined in the Service Definition file.

ROLEINSTANCEENDPOINT
The RoleInstanceEndpoint class represents an input endpoint or internal endpoint associated with an instance. It has two properties: RoleInstance, identifying the instance associated with the endpoint; and IPEndpoint, containing the local IP address of the instance and the port number for the endpoint.

LOCALRESOURCE
LocalResource represents the local storage, on the file system of the instance, defined for the role in the Service Definition file. Each instance has its own local storage that is not accessible from other instances. LocalResource exposes three read-only properties: Name, uniquely identifying the local storage; MaximumSizeInMegabytes, specifying the maximum amount of space available; and RootPath, specifying the root path of the local storage in the local file system.
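The in-place upgrade walk described under SERVICE UPGRADES explains why the SLA needs at least two instances per role. The small calculation below is a sketch under one loudly stated assumption: instances are spread round-robin over the upgrade domains. The real allocation is decided entirely by the Azure fabric, so treat the numbers as illustrative. The class and method names are made up for this example.

```csharp
using System;
using System.Linq;

static class UpgradeDomainWalk
{
    // Smallest number of instances still running at any step of an in-place upgrade,
    // ASSUMING a round-robin spread of instances over the upgrade domains.
    public static int MinInstancesInService(int instanceCount, int upgradeDomainCount)
    {
        int[] perDomain = new int[upgradeDomainCount];
        for (int i = 0; i < instanceCount; i++)
        {
            perDomain[i % upgradeDomainCount]++;
        }

        // One whole domain is taken down at a time, so the worst step is the one
        // that stops the most heavily populated domain.
        return instanceCount - perDomain.Max();
    }

    static void Main()
    {
        Console.WriteLine(MinInstancesInService(1, 5));   // 0: a single instance means downtime
        Console.WriteLine(MinInstancesInService(2, 5));   // 1
        Console.WriteLine(MinInstancesInService(12, 5));  // 9
    }
}
```

A running service can compare this estimate with reality by grouping its instances on the RoleInstance.UpdateDomain property described above.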
  • 79. The Windows Azure Platform: Articles from the Trenches
CHAPTER 4: SQL AZURE

CONNECTING TO SQL AZURE IN 5 MINUTES
By Juliën Hanssens

"Put your data in the cloud!" Think about it: no more client-side database deployment and no more configuring of servers, yet your data is mirrored and still accessible with the tools and techniques familiar to SQL Server developers. That's SQL Azure. In this article we will quickly bring you up to speed on how to get started with your own SQL Azure instance in less than five minutes!

PREREQUISITE – GET A SQL AZURE ACCOUNT
Let's assume you already have a SQL Azure account. If not, you're free to try one of the special offers that Microsoft makes available [1], like the free-of-charge Introductory Special or the offer that is available for MSDN Premium subscribers.

WORKING WITH THE SQL AZURE PORTAL
With a SQL Azure account at your disposal, you first need to log in to the SQL Azure Portal [2]. This is your dashboard for managing your own server instances. The first time you log in to the SQL Azure Portal, and after accepting the Terms of Use, you will be asked to create a server instance for SQL Azure, as the screenshot below illustrates:

Figure 1: Create a server through the SQL Azure Portal

Providing a username and password is pretty straightforward. Do notice that these credentials will be the equivalent of your "sa" SQL Server account, so strong password rules logically apply. Certain user names are also not allowed for security reasons. With the location option you can select the physical location of the datacenter at which your server instance will be hosted. It is advisable to select the geographical location nearest to you or your users.
  • 80. The Windows Azure Platform: Articles from the Trenches
Once you press the Create Server button it takes a second or two to initialize your fresh, new server, and you'll be redirected to the Server Administration subsection. Congratulations, you've just performed a "SQL Server installation in the cloud"!

CREATE A DATABASE THROUGH THE SERVER ADMINISTRATION
Whilst still in the Server Administration section of the SQL Azure Portal [2], our server details are listed, such as the name used for the connection string and a list of databases. The latter is, by default, populated only with a 'master' database. Exactly as in SQL Server, this specific database contains the system-level information, such as system configuration settings and logon accounts. We are going to leave the master database untouched and create a new database by pressing the Create Database button.

Figure 2: Create a database through the SQL Azure Portal

On confirmation the database will be created in the "blink of an eye". For those who find this too convenient, you can achieve the same result using a slim script like:

    CREATE DATABASE SqlAzureSandbox
    GO

However, in order to feed our database some scripts we need to set up security and get our hands on a management tool. For the latter, why not use the tool we have been using for ages to connect to our "regular" SQL Server instances: SQL Server 2008 R2 Management Studio (SSMS).

CONFIGURING THE FIREWALL
By default you initially cannot connect to SQL Azure with tools like SSMS. At least, not until you explicitly tell your SQL Azure instance that a specific IP address is allowed to connect, with pretty much all administrative privileges.
  • 81. The Windows Azure Platform: Articles from the Trenches
To enable connectivity, add a rule by entering your public IP address in the Firewall Settings tab on your SQL Azure Portal.

Figure 3: Add a firewall rule through the SQL Azure Portal's Server Administration

Do notice the "Allow Microsoft Services access to this server" checkbox. By enabling this you allow other Windows Azure services to access your server instance.

CONNECTING USING SQL SERVER MANAGEMENT STUDIO
Having set up everything required for enabling SSMS to manage the database, let's start using it. If you haven't done so already, install the latest R2 release of SSMS [3] first. Older versions will just bore you with annoying error messages, so don't waste time on that.

  • 82. The Windows Azure Platform: Articles from the Trenches
Once in place, boot up the SSMS application, enter the full server name and authenticate using the provided credentials.

Figure 4: Connecting SQL Server Management Studio to your SQL Azure instance

No rocket science there either. Optionally you can provide a specific database to connect to in the Options section (more on that later). Once connected, you have a pretty similar environment with SSMS on SQL Azure as you have on a 'regular' SQL Server instance. Keep in mind, though, that with the current release you have to do without the comfortable dialog boxes. This means you need to brush up your T-SQL skills.

SQL Azure offers a subset, albeit a significant one, of the familiar T-SQL features and commands you are used to using with SQL Server. This is due to the fact that SQL Azure is designed natively for the Windows Azure platform. In a nutshell, this means that the creation of tables, views, logins, stored procedures etc. by using scripts has roughly the same T-SQL syntax and only lacks certain (optional) parameters.

Let's demonstrate this by creating an arbitrary table. In SSMS, right-click the Tables section of our SqlAzureSandbox database and select "New Table". The result will not be a dialog box with fancy fields, but a basic SQL script for us to edit. Once modified, it doesn't really differ from your average SQL Server script. For example:

    -- =========================================
    -- Create table template SQL Azure Database
    -- =========================================
    IF OBJECT_ID('[dbo].[Beer]', 'U') IS NOT NULL
        DROP TABLE [dbo].[Beer]
    GO

    CREATE TABLE [dbo].[Beer]
    (
        [Id] int NOT NULL,
        [BeerName] nvarchar(50) NULL,
        [CountryOfOrigin] nvarchar(50) NULL,
        [AlcoholPercentage] int NULL,
        [DateAdded] datetime NOT NULL,
        CONSTRAINT [PK_Beer] PRIMARY KEY CLUSTERED
        (
            [Id] ASC
        )
    )
    GO

  • 83. The Windows Azure Platform: Articles from the Trenches
Once executed, the table is generated. This is one thing to take notice of: by default, tables have to be created through scripts in SSMS. But once they're available you can simply boot up Visual Studio and use the Server Explorer to access them in your project. In fact, you can even use familiar tools with design-time support like LINQ to SQL, ADO.NET DataSets or Entity Framework for even more productivity.

APPLICATION CREDENTIALS
Last but not least, a recommendation on security. Up until now we have used our godlike master credentials for managing our database. We really don't want these credentials to be included in our application, so let's create a lightweight custom user/login for our application to use:

    -- 1. Create a login (run while connected to the master database)
    CREATE LOGIN [ApplicationLogin] WITH PASSWORD = 'I@mR00tB33r'
    GO

    -- 2. Create a user (run while connected to the application database)
    CREATE USER [MyBeerApplication] FOR LOGIN [ApplicationLogin] WITH DEFAULT_SCHEMA = [db_datareader]
    GO

    -- 3. And grant it access permissions
    GRANT CONNECT TO [MyBeerApplication]
    GO

KEEP IN MIND – THE TARGET DATABASE
As you may have noticed, all samples lack the USE statement, i.e. "USE [SqlAzureSandbox]". This is because the USE <database> command is not supported. With SQL Azure you should keep in mind that each database can be on a different server and therefore requires a separate connection. With SSMS you can easily achieve this in the options of the Connect to Server dialog box:
  • 84. The Windows Azure Platform: Articles from the Trenches
Figure 5: Connect to a specific database using SQL Server Management Studio

Take notice of this when you are frequently switching between databases. And with that in mind, the sky is the limit. Even in the cloud.

1. Microsoft Windows Azure Platform
2. SQL Azure Portal
3. SQL Server 2008 R2 Management Studio Express (SSMSE)
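The "one connection per database" point from KEEP IN MIND applies to application code as well as to SSMS. The sketch below builds a separate connection string for each target database instead of relying on USE. It uses SqlConnectionStringBuilder from System.Data.SqlClient. The server name "myserver" is hypothetical, and the `tcp:<server>.database.windows.net` address and `user@server` login form are assumptions about how the server is addressed. Nothing is actually connected here.

```csharp
using System;
using System.Data.SqlClient;

static class SqlAzureConnectionStrings
{
    // Builds a connection string that targets ONE database on a SQL Azure server.
    // Switching databases means building a second connection string and opening
    // a second connection, rather than issuing USE.
    public static string For(string server, string database, string user, string password)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "tcp:" + server + ".database.windows.net", // assumed address form
            InitialCatalog = database,
            UserID = user + "@" + server,                           // assumed login form
            Password = password,
            Encrypt = true
        };
        return builder.ConnectionString;
    }

    static void Main()
    {
        // "myserver" is a hypothetical server name
        string sandbox = For("myserver", "SqlAzureSandbox", "ApplicationLogin", "I@mR00tB33r");
        string master = For("myserver", "master", "ApplicationLogin", "I@mR00tB33r");

        Console.WriteLine(sandbox);
        Console.WriteLine(master);
    }
}
```

In a real application the password would come from configuration rather than source code, and each string would be passed to its own SqlConnection.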
  • 85. The Windows Azure Platform: Articles from the Trenches
CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC

REAL TIME TRACING OF AZURE ROLES FROM YOUR DESKTOP
By Richard Prodger

One of the big challenges with a deployed Azure-hosted role is how to get access to tracing information. You can use Azure Diagnostics to collect data in table storage, but this is far from ideal: you have to read the data out, and that doesn't give you real-time information. There is a better way! The .NET Framework already provides the TraceListener class that most of you will be familiar with. By creating your own custom TraceListener, you can push trace messages anywhere you like. Then, by using the magic provided by the service bus for traversing firewalls, you can pick up these trace messages in an application running on your desktop.

CUSTOM TRACE LISTENER
We need a client to send the messages and a server to receive them. Let's start with the Azure client. The first thing to do is implement the custom TraceListener:

    public class AzureTraceListener : TraceListener
    {
        ITrace traceChannel;

        public AzureTraceListener(string serviceNamespace, string servicePath, string issuerName, string issuerSecret)
        {
            // Create the endpoint address for the service bus
            Uri serviceUri = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, servicePath);
            EndpointAddress endPoint = new EndpointAddress(serviceUri);

            // Setup the authentication
            TransportClientEndpointBehavior credentials = new TransportClientEndpointBehavior();
            credentials.CredentialType = TransportClientCredentialType.SharedSecret;
            credentials.Credentials.SharedSecret.IssuerName = issuerName;
            credentials.Credentials.SharedSecret.IssuerSecret = issuerSecret;

            // Create the channel and open it
            ChannelFactory<ITrace> channelFactory = new ChannelFactory<ITrace>(new NetEventRelayBinding(), endPoint);
            channelFactory.Endpoint.Behaviors.Add(credentials);
            traceChannel = channelFactory.CreateChannel();
        }

        public override void WriteLine(string message)
        {
            traceChannel.WriteLine(message);
        }

        public override void Write(string message)
        {
            traceChannel.Write(message);
        }
    }

  • 86. The Windows Azure Platform: Articles from the Trenches
As you can see, there is some setup for WCF and the service bus, but basically all you have to do is override the Write and WriteLine methods. The ITrace interface is simple as well:

    [ServiceContract]
    public interface ITrace
    {
        [OperationContract(IsOneWay = true)]
        void WriteLine(string text);

        [OperationContract(IsOneWay = true)]
        void Write(string text);
    }

SEND MESSAGE CONSOLE APPLICATION
Now we need an app to send the messages. For the purposes of this article, I have created a simple console app, but this could be any Azure role.

    static void Main(string[] args)
    {
        string issuerName = "yourissuerName";
        string issuerSecret = "yoursecret";
        string serviceNamespace = "yourNamespace";
        string servicePath = "tracer";

        TraceListener traceListener = new AzureTraceListener(serviceNamespace, servicePath, issuerName, issuerSecret);
        Trace.Listeners.Add(traceListener);
        Trace.Listeners.Add(new TextWriterTraceListener(Console.Out));

        while (true)
        {
            Trace.WriteLine("Hello world at " + DateTime.Now.ToString());
            Thread.Sleep(1000);
        }
    }

This app creates a new custom TraceListener, adds it to the Trace.Listeners collection, and then writes a timestamped message every second. I've also added Console.Out as another listener so you can see what's being sent.

TRACE SERVICE
So that's the Azure end done; what about the desktop end? The first thing you have to do is implement the TraceService that the custom listener will call:

    public class TraceService : ITrace
    {
        public static event ReceivedMessageEventHandler RecievedMessageEvent;

        void ITrace.WriteLine(string text)
        {
            // The event is null until a handler is attached, so copy and check it first
            ReceivedMessageEventHandler handler = RecievedMessageEvent;
            if (handler != null) handler(this, text);
        }

        void ITrace.Write(string text)
        {
            ReceivedMessageEventHandler handler = RecievedMessageEvent;
            if (handler != null) handler(this, text);
        }
    }

    public delegate void ReceivedMessageEventHandler(object sender, string message);

  • 87. The Windows Azure Platform: Articles from the Trenches
The event delegate is there to push out the messages to the app hosting this class.

SERVICE HOST CLASS
Next, we have to create the class that will host the service:

    public class AzureTraceReceiver
    {
        ServiceHost serviceHost;

        public AzureTraceReceiver(string serviceNamespace, string servicePath, string issuerName, string issuerSecret)
        {
            // Create the endpoint address for the service bus
            Uri serviceUri = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, servicePath);
            EndpointAddress endPoint = new EndpointAddress(serviceUri);

            // Setup the authentication
            TransportClientEndpointBehavior credentials = new TransportClientEndpointBehavior();
            credentials.CredentialType = TransportClientCredentialType.SharedSecret;
            credentials.Credentials.SharedSecret.IssuerName = issuerName;
            credentials.Credentials.SharedSecret.IssuerSecret = issuerSecret;

            serviceHost = new ServiceHost(typeof(TraceService));
            ServiceEndpoint endpoint = serviceHost.AddServiceEndpoint(typeof(ITrace), new NetEventRelayBinding(), serviceUri);
            endpoint.Behaviors.Add(credentials);
        }

        public void Start()
        {
            serviceHost.Open();
        }

        public void Stop()
        {
            serviceHost.Close();
        }
    }

This is basic WCF code, nothing special here. All we do is create some credentials for authenticating with the service bus, create an endpoint, add the credentials and start up the service host.
  • 88. SERVICE

Now all we have to do is implement the desktop app. Again, for simplicity, I am creating a simple console app:

    static void Main(string[] args)
    {
        Console.Write("AZURE Trace Listener Sample started.\nRegistering with Service Bus...");

        string issuerName = "yourissuerName";
        string issuerSecret = "yoursecret";
        string serviceNamespace = "yourNamespace";
        string servicePath = "tracer";

        // Hook up the event handler for incoming messages before the host
        // opens, so that no early message arrives without a subscriber
        TraceService.RecievedMessageEvent += new ReceivedMessageEventHandler(TraceService_myEvent);

        // Start up the receiver
        AzureTraceReceiver receiver = new AzureTraceReceiver(serviceNamespace, servicePath, issuerName, issuerSecret);
        receiver.Start();

        // Now, just hang around and wait!
        Console.WriteLine("DONE\nWaiting for trace messages...");
        Console.ReadLine();
        receiver.Stop();
    }

    static void TraceService_myEvent(object sender, string message)
    {
        Console.WriteLine(message);
    }

This app registers an event handler, instantiates the receiver class and starts the service host, and then just waits for messages. When the client sends a trace message, the event handler fires and the message is written to the console.

You may have noticed that I have used the NetEventRelayBinding for the service bus. This was deliberate as it allows you to hook up multiple server ends to receive the messages in a classic pub/sub pattern. This means you can run multiple instances of this server on multiple machines and they all receive the same messages. You can use other bindings if required. Another advantage of this binding is that you don't have to have any apps listening, but bear in mind you will be charged for the connection whether you are listening or not, although you won’t have to pay for the outbound bandwidth.

I put all the WCF and service bus setup into the code, but this could easily be placed into a configuration file.
I prefer it this way as I have a blind spot when it comes to reading WCF config in XML and I always get it wrong, but it does mean you can’t change the bindings without recompiling.

SUMMARY

There is more that could be done in the TraceListener class to improve thread safety and error handling, and to ensure that the service bus channel is available when you want to use it, but I’ll leave that up to you.

  • 89. This code was first put together whilst the AppFabric ServiceBus was in private beta. Microsoft has now included a version of this code in their SDK samples, so take a look there.

So that's it. You now have the ability to monitor your Azure roles from anywhere.
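The summary leaves thread safety, error handling and channel availability as an exercise. As one possible starting point, here is a minimal sketch of a listener that serializes calls, creates its channel lazily and never lets a failed send throw back into the code being traced. It is not the SDK sample: ITraceSink, MemorySink, SafeTraceListener and DroppedMessages are hypothetical names introduced for illustration, and ITraceSink stands in for the ITrace contract without the WCF attributes so that the sketch compiles without the Service Bus SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the article's ITrace contract (no WCF attributes).
public interface ITraceSink
{
    void WriteLine(string text);
    void Write(string text);
}

// Hypothetical in-memory sink, only here so the sketch can be exercised locally.
public class MemorySink : ITraceSink
{
    public readonly List<string> Lines = new List<string>();
    public void WriteLine(string text) { Lines.Add(text); }
    public void Write(string text) { Lines.Add(text); }
}

// Assumption: losing a trace message is preferable to crashing the role.
public class SafeTraceListener : TraceListener
{
    private readonly object sync = new object();
    private readonly Func<ITraceSink> createSink;
    private ITraceSink sink;

    // Number of messages that could not be delivered.
    public int DroppedMessages { get; private set; }

    public SafeTraceListener(Func<ITraceSink> createSink)
    {
        if (createSink == null) throw new ArgumentNullException("createSink");
        this.createSink = createSink;
    }

    public override void Write(string message)
    {
        Send(s => s.Write(message));
    }

    public override void WriteLine(string message)
    {
        Send(s => s.WriteLine(message));
    }

    private void Send(Action<ITraceSink> send)
    {
        lock (sync) // one caller at a time touches the sink
        {
            try
            {
                // Create the sink on first use, or again after a failure
                if (sink == null) sink = createSink();
                send(sink);
            }
            catch (Exception)
            {
                sink = null;       // discard it so the next call starts fresh
                DroppedMessages++; // swallow the error, but keep count
            }
        }
    }
}
```

With the article's types, the factory delegate would presumably wrap channelFactory.CreateChannel(), with ITrace in place of ITraceSink. A real WCF channel that has faulted would probably also need Abort() called on it before being discarded, which this sketch does not attempt.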
  • 90. MEET THE AUTHORS

ERIC NELSON

After many years of developing on UNIX/RDBMS (and being able to get mortgages) Eric joined Microsoft in 1996 as a Technical Evangelist (and stopped being able to get mortgages due to his new 'unusual job title' in the words of his bank manager). He has spent most of his time working with ISVs to help them architect solutions which make use of the latest Microsoft technologies - from the beta of ASP 1.0 through to ASP.NET, from MTS to WCF/WF and from the beta of SQL Server 6.5 through to SQL Server 2008. Along the way he has met lots of smart and fun developers - and been completely stumped by many of their questions! In July 2008 he switched role from an Application Architect to a Developer Evangelist in the Developer and Platform Group.

Developer Evangelist, Microsoft UK
Website: Email: Blog: Twitter:

MARCUS TILLETT

Marcus Tillett is currently the Head of Technology at Dot Net Solutions, where he heads the technical team of architects and developers. Having built solutions with Microsoft technologies for more than 10 years, his expertise is in software architecture and application development. He is passionate about understanding and using the latest cutting-edge technology. He is the author of “Thinking of... Delivering Solutions on the Windows Azure Platform?”

Head of Technology at Dot Net Solutions
Twitter: @drmarcustillett Blog:
  • 91. RICHARD PRODGER

Richard Prodger is a founding Technical Director of Active Web Solutions with more than 25 years' experience in the R&D and computing sectors. Richard’s primary responsibilities are technical strategy and systems development. Richard is the Director responsible for the AWS Technology Centre. Prior to joining AWS, Richard managed BT's Web Services unit. At BT, Richard was responsible for implementing large scale e-commerce and web based systems and for translating emerging technology into practical business solutions. Richard was the principal architect and technical design authority for the multi-award winning RNLI Sea Safety system. More recently, Richard has been working closely with Microsoft on their cloud services platform, Windows Azure.

Technical Director of Active Web Solutions

SAKSHAM GAUTAM

Saksham Gautam started working with Windows Azure right from the early stages of the development of the platform. He is an MCTS on WCF, and he graduated with a Bachelor's in Computer Science in 2007. Since then, he has been working as a Software Developer for AWS. Saksham is one of the architects and the lead developer for porting the existing on-premise sea safety system to Windows Azure. Apart from Azure and .NET, he is interested in distributed systems composed of heterogeneous components. He presented his work on interoperability in Windows Azure at the Architect Insight Conference 2010. He is currently based in Prague and builds interesting software, particularly using C#.NET.

Software Developer/Architect, Active Web Solutions
Twitter: @sakshamgautam Blog:
  • 92. STEVE TOWLER

Steve Towler is a Senior Software Developer for Active Web Solutions in Ipswich and has been working with Windows Azure since April 2009. In that time he has helped develop a number of applications hosted in Windows Azure including a CAD drawing collaboration tool and a location based services application. Steve has also been conducting a number of Azure Assessment Days in conjunction with Microsoft and promoting the benefits of cloud computing.

Senior Software Developer, Active Web Solutions
Blog:

ROB BLACKWELL

Rob Blackwell is R&D Director at Ipswich-based Active Web Solutions. He was part of a team that won an unprecedented three British Computer Society awards in 2006 and was a Microsoft Most Valuable Professional (MVP) in 2007 and 2008. Rob is a self-confessed language nerd and freely admits that the real reason he’s interested in running Java on Azure is so that he can host his spare-time Clojure Lisp experiments.

R&D Director, Active Web Solutions
Blog: Twitter:

JULIËN HANSSENS

Juliën Hanssens is a Software Engineer and Technical Consultant in software technologies at Securancy Intelligence, a Dutch IT company. He can be contacted at

Software Engineer at Securancy Intelligence
Email:
  • 93. SIMON MUNRO

Simon Munro, a senior consultant at London-based EMC Consulting, has been designing and developing commercial applications for two decades. Despite this, he still has a deep-rooted need to write production code every day. Branded as a thought-leader, Simon enjoys stirring things up and pushing conformity by challenging acceptable norms and asking difficult questions. His current endeavors include helping developers and customers understand the underlying architectural concepts around cloud computing.

Senior consultant at EMC Consulting
Blog: Twitter: @simonmunro

SARANG KULKARNI

Sarang is an Analyst Programmer with Accenture-Avanade during work hours and a technology nomad after that. He has been coding for food and gadgets for the past 8 years around all things Microsoft including ATL/COM, VB6, .Net Framework 2.0 onwards, Winforms, WCF, ASP.Net, WIF and now Windows Azure, targeting varied assignments ranging from run-of-the-mill enterprise LOB applications to Astrometry APIs and media transcoding solutions in the cloud. He dwells in Pune, India with daughter Saee and wife Prajakta.

Analyst Programmer with Accenture-Avanade
Blog: Email:

STEVEN NAGY

By day, Steven is a .Net consultant who likes diving deep into the technologies he is passionate about, and has been learning, teaching, and presenting on Azure since its first public release at PDC 08. By night he cackles gleefully, basking in the glow of his laptop screen as thousands of Azure worker roles carry out his evil bidding.

.NET Consultant
Blog: Twitter: snagy
  • 94. GRACE MOLLISON

Grace’s role as a Platform Architect at EMC bridges the gap between Infrastructure and Development. Activities range from supporting the development teams throughout the development life cycle and liaising between the client, third parties (e.g. hosting partners) and EMC Consulting as required, to advising on and architecting the platform. Grace has a lot of enthusiasm for public cloud solutions and has been dabbling with Azure from early betas. Grace was part of the team that developed the ‘See The Difference’ solution, which was built using Windows Azure and SQL Azure. Grace is a CISSP (Certified Information Systems Security Professional). Grace joined EMC in 2008 from Hogg Robinson, where she was responsible for the design, implementation and ongoing maintenance, support and evolution of their eCommerce platform, which has BizTalk at its core.

Platform Architect at EMC
Blog:

JASON NAPPI

Jason is a Software Architect at SmartPak, where he advances their eCommerce engine and line-of-business applications. He has 14 years' experience as a developer, mainly on the Microsoft stack, developing applications beginning with VB 6, MTS/COM+ and ASP, through each successive version of the .NET framework, and even picked up a certification along the way. He’s held roles in a variety of industries in the Boston area including health care, web hosting, financial services, and eCommerce. Most recently Jason’s been struggling to keep pace with the Entity Framework, Silverlight, Azure, ASP.NET MVC, WCF Rest, WCF Data Services, and the myriad of other technologies pouring out of Redmond and elsewhere.

Software Architect, SmartPak
Blog:
  • 95. JOSH TUCHOLSKI

Josh works for Rosetta as a Senior Technology Associate in the Microsoft Solution Center where he takes part in helping Rosetta deliver interactive marketing solutions to clients in the financial, ecommerce, B2B, and healthcare sectors. He has experience working with small teams to large enterprise environments and focuses on WCF service development, RIA apps, and middle-tier component integration. He strives to produce simple solutions that create sound technical architectures and can tell a great story at the end of the day. Outside of development, he enjoys meeting with students in computer science and software engineering to pick their brains and help them prepare for their professional careers. Josh lives in Ohio with his wife Andrea.

Senior Technology Associate
LinkedIn: Twitter: Blog:

DAVID GRISTWOOD

Ever since he wrote his first ‘10 ? “hello world” : goto 10’ program on a PET computer in the late 70s, David has been hooked, and has worked with computers ever since. During his career, David has secured a Distinction in Computing Science at Newcastle University, worked as a freelance computer journalist, a visiting lecturer in Computer Science and a director of a software company, as well as having designed and developed a wide range of software and computer systems. For the last 15 years David has worked at Microsoft, firstly in its fledgling consultant service section, then in EMEA as a technical evangelist. Since Microsoft’s launch of .NET, he has been focused on the .NET platform, helping design and build a wide range of systems, from smart clients to web applications, and more recently, cloud computing with the Windows Azure platform. He currently works mainly with partners and startups, and runs and delivers regular technical briefings around the Microsoft platform, including TechEd Europe, TechDays, BizSpark Camp, etc.

Developer Evangelist, Microsoft UK
Twitter: @scroffthebad
  • 96. NEIL MACKENZIE

Neil Mackenzie has been programming since the late Bronze Age. He learned C++ when the only book available was written by Bjarne Stroustrup. He has been using SQL Server since v4.2.1 on Windows NT. However, he is only a recent convert to the joys of .Net Framework and C#. He has been using Windows Azure since PDC 2008 and regrets the demise of the Live Framework API. Neil spent many years working in healthcare software and is currently involved in a stealth data-analytics startup. Neil lives in San Francisco, CA having noticed the weather there is somewhat better than it was in Scotland.

Blog: Twitter: @mknz

MARK RENDLE

Mark is currently employed as a Senior Software Architect by Dot Net Solutions Ltd, creating all manner of software on the Microsoft stack, including ASP.NET MVC, Windows Azure, WPF and Silverlight. His career in software design and development spans three decades and more programming languages than he can remember. C# has been his favourite language pretty much since the first public beta, when you had to write the code in a text editor and compile it on the command line. Those were the days. You kids today, with your IntelliSense and your ReSharpers, don’t know you’re born... Things vying for Mark’s attention lately include functional programming, internet-centric applications, the Azure cloud platform and NoSQL data stores.

Senior Software Architect, Dot Net Solutions Ltd
Blog: