Windows Azure Platform: Articles from the Trenches, Volume One

Developers have been exploring the possibilities opened up by the Windows Azure Platform for Cloud Computing. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others become successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications.


Document Transcript

  • The Windows Azure Platform: Articles from the Trenches Volume One Editor and copy and paste guru: Eric Nelson and 15 authors smarter than him 22nd June 2010 (v0.9) Cover art by Andrew Fryer Developers have been exploring the possibilities opened up by the Windows Azure Platform for Cloud Computing. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others become successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications.
  • The Windows Azure Platform: Articles from the Trenches

TABLE OF CONTENTS

INTRODUCTION 6
  From the Editor 6
  Would you like to become an author for a future edition? 6
  Introduction to the Windows Azure Platform 7
  AE – Acronyms Explained 8

CHAPTER 1: GETTING STARTED 9
  5 steps to getting started with Windows Azure 9
  Step 1: Creating an Azure account 9
  Step 2: Provisioning a SQL Azure database 9
  Step 3: Building a Web Application for Azure 10
  Step 4: Packaging the Web Application for Windows Azure 11
  Step 5: Deploying the Web Application to Azure 11
  The best tools for working with the Windows Azure Platform 14
  Category: The usual suspects 14
  Category: Windows Azure Storage 14
  Category: Windows Azure diagnostics 17
  Category: SQL Azure 18
  Category: General Development 19

CHAPTER 2: WINDOWS AZURE PLATFORM 20
  Architecting For Azure – Building Highly Scalable Applications 20
  Principles of Azure Architectures 20
  Partition Data 20
  Colocation 21
  Cache 21
  State 21
  Distribute Workloads Effectively 22
  Maximise Resources 22
  Summary 23
  The Windows Azure Platform and Cost-Oriented Architecture 24
  Cost is important 24
  What costs to consider 24
  Conclusion 25
  De-risking Your First Windows Azure Project 26
  Popular Risks 26
  Non-Technical Tactics for Reducing Risk 27
  Technical Tactics for Reducing Risk 28
  Developer Responsibility 29
  Trials & tribulations of working with Azure when there's more than one of you 30
  Development Environment 30
  Test Environment 30
  Certificates 31
  When things go wrong 31
  Summary 31
  Using a Continuous Integration build to achieve an automated deployment of your latest build 32
  Getting the right "bits" 32
  Packaging for deployment 32
  Deploying 33
  Using Java with the Windows Azure Platform 35
  Accessing Windows Azure Storage from Java 35
  Running Java Code on Windows Azure 36
  AzureRunme 37

CHAPTER 3: WINDOWS AZURE 39
  Auto-Scaling Windows Azure Compute Instances 39
  Introduction 39
  A Basic Approach 39
  The Scale Agent 39
  Monitoring: Retrieving Diagnostic Information 40
  Rules: Establishing When To Scale 41
  Trust: Authorising For Scale 42
  Scaling – The Service Management API 44
  Summary 45
  Building a Content-Based Router Service on Windows Azure 46
  Bing Maps Tile Servers using Azure Blob Storage 49
  Azure Drive 51
  Guest OS 51
  VHD 51
  CloudDrive 52
  Development Environment 53
  Azure Table Service as a NoSQL database 55
  Master-Detail structures 55
  Dynamic schema 55
  Column names as data 56
  Table names as data 56
  Summary 57
  Queries and Azure Tables 58
  CreateQuery<T>() 58
  Contexts 59
  Querying on PartitionKey and RowKey 59
  Continuation 60
  DataServiceQuery 60
  CloudTableQuery 61
  Tricks for storing time and date fields in Table Storage 64
  Using Worker Roles to Implement a Distributed Cache 68
  Configuring the Cache 68
  Using the Distributed Cache 69
  Logging, diagnostics and health monitoring of Windows Azure Applications 71
  Collecting diagnostic data 71
  Persisting diagnostic data 72
  Analysing the diagnostic data 72
  More information 73
  Service Runtime in Windows Azure 74
  Roles and Instances 74
  Endpoints 74
  Service Upgrades 74
  Service Definition and Service Configuration 75
  RoleEntryPoint 75
  Role 76
  RoleEnvironment 76
  RoleInstance 77
  RoleInstanceEndpoint 78
  LocalResource 78

CHAPTER 4: SQL AZURE 79
  Connecting to SQL Azure in 5 Minutes 79
  Prerequisite – Get a SQL Azure account 79
  Working with the SQL Azure Portal 79
  Create a database through the Server Administration 80
  Configuring the firewall 80
  Connecting using SQL Server Management Studio 81
  Application credentials 83
  Keep in mind – the target database 83

CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC 85
  Real Time Tracing of Azure Roles from Your Desktop 85
  Custom Trace Listener 85
  Send Message Console Application 86
  Trace Service 86
  Service Host Class 87
  Service 88
  Summary 88

MEET THE AUTHORS 90
  Eric Nelson 90
  Marcus Tillett 90
  Richard Prodger 91
  Saksham Gautam 91
  Steve Towler 92
  Rob Blackwell 92
  Juliën Hanssens 92
  Simon Munro 93
  Sarang Kulkarni 93
  Steven Nagy 93
  Grace Mollison 94
  Jason Nappi 94
  Josh Tucholski 95
  David Gristwood 95
  Neil Mackenzie 96
  Mark Rendle 96
  • The Windows Azure Platform: Articles from the Trenches INTRODUCTION FROM THE EDITOR Hello all, The Windows Azure Platform is changing the way we architect, implement, deploy and manage solutions. In early 2010 it went live and in the first six months we have already seen an impressively diverse range of solutions developed to take advantage of the services offered. This book pulls together great articles from many of those developers who have been active with the Windows Azure Platform to hopefully help others be successful. There are twenty articles in this first volume covering everything from getting started to implementing best practices for elastic applications. You are not expected to read it in order from start to finish. Instead I would encourage you to head straight to the chapters or the individual articles that look most relevant or interesting. The book was put together in May and early June 2010 which means that it pre-dates the 1.2 release of the Windows Azure SDK. The 1.2 release adds some great new features, especially for Visual Studio 2010 and .NET Framework 4.0 in areas such as debugging and IDE integration. Volume Two of this book will cover off those new features (and more!) Once you have had a chance to look at the articles please give us your feedback at http://bit.ly/azureebook1feedback (It should take less than one minute). Thank you and happy reading. Eric Nelson Developer Evangelist, Microsoft UK Website: http://www.ericnelson.co.uk Email: eric.nelson@microsoft.com Blog: http://geekswithblogs.net/iupdateable Twitter: http://twitter.com/ericnel WOULD YOU LIKE TO BECOME AN AUTHOR FOR A FUTURE EDITION? Developers value the sharing of best practices, knowledge and experiences – knowledge and experiences such as your own. If you have insight into the Windows Azure Platform then you are a great candidate for becoming an author involved in the next volume of this book as the Windows Azure Platform continues to evolve and broaden. Please email me (eric.nelson@microsoft.com) with your proposed article(s) and if possible a "sample of your work" such as a link to your blog. 6
  • The Windows Azure Platform: Articles from the Trenches INTRODUCTION TO THE WINDOWS AZURE PLATFORM The Windows Azure Platform contains three technologies which can be used individually or together to build solutions which run “in the cloud”. For the first time you are able to run your code and store your data in Microsoft datacenters and let Microsoft take on some of the responsibility for keeping your solution running great and able to respond to the changing demands of business. Solutions can either run entirely on the Windows Azure Platform or as a hybrid, with some of the solution running on-premise or elsewhere on the Internet. The three key technologies are Windows Azure, SQL Azure and Windows Azure Platform AppFabric: Windows Azure  Windows Azure is the cloud services operating system for the Windows Azure Platform. Windows Azure provides developers with on-demand compute and storage to run your code and store your data.  Windows Azure supports a consistent development experience through its integration with Visual Studio 2008 and Visual Studio 2010. Windows Azure is an open platform that supports both Microsoft and non-Microsoft languages and technologies. Windows Azure welcomes third-party tools and technologies such as Eclipse, Ruby, PHP, and Python. SQL Azure  Microsoft SQL Azure delivers the capabilities of Microsoft SQL Server to Windows Azure applications or applications running outside of the Windows Azure Platform. It can store and retrieve structured, semi-structured, and unstructured data with the advantage of high availability through the storage of multiple copies of your data. It enables relational queries, search, and data synchronization with mobile users, remote offices and business partners. Windows Azure Platform AppFabric  AppFabric provides secure connectivity as a service to help developers bridge cloud, on- premise, and hosted deployments. AppFabric comprises Service Bus and Access Control. From simple eventing scenarios to complex protocol tunneling, AppFabric Service Bus gives developers the flexibility to choose how their applications communicate; addressing the challenges presented by firewalls, NATs, dynamic IP, and disparate identity systems. AppFabric Access Control enables simple, secure authorization for RESTful web services that federate with a variety of identiy providers. There are many articles, videos and screencasts designed to help you get up to speed with the Windows Azure Platform and a great place to start is http://bit.ly/startazure. We also have a Getting Started chapter within this book. 7
  • The Windows Azure Platform: Articles from the Trenches AE – ACRONYMS EXPLAINED  If you are new to the Windows Azure Platform then you may need a little help with some of the acronyms and industry terms used in this book.  REST and RESTful - Representational State Transfer. A style of software architecture to enable clients and servers to interact.  WCF – Windows Communication Foundation. A technology shipped initially in .NET Framework 3.0 to allow communication to take place between code running in different "locations".  Cloud Computing – running of code and storage of data off-premise. (Also see the 100+ alternative definitions of Cloud Computing  e.g. http://en.wikipedia.org/wiki/Cloud_computing )  Elastic Computing – as more processing power is needed or as more data needs to be stored, elastic computing (in our case the Windows Azure Platform) promises to rapidly respond to those demands and provision out additional compute and storage resources.  PaaS – Platform as a Service is one approach to Cloud Computing that favors abstraction and simplicity over flexibility e.g. the Windows Azure Platform.  IaaS – Infrastructure as a Service is one approach to Cloud Computing that favors flexibility over abstraction and simplicity e.g. Amazon Web Services.  Codename "Dallas" – a 4th member of the Windows Azure Platform, currently in CTP. http://www.microsoft.com/WindowsAzure/dallas/  CTP – Community Technology Preview. In simple terms – not quite as solid as a traditional Beta  8
  • The Windows Azure Platform: Articles from the Trenches CHAPTER 1: GETTING STARTED 5 STEPS TO GETTING STARTED WITH WINDOWS AZURE By Jason Nappi Getting started with a new technology can be daunting, but generally once you get going things become familiar and learning accelerates. Therefore, I’d like to focus on providing a few of the basic steps that I recently went through in the hope that it will both answer some of the basic questions and knock down some of the barriers to accelerated learning. The following are some of the primary design considerations for what I think of as a typical business application, and the implications of building those same types of applications in the Azure cloud. STEP 1: CREATING AN AZURE ACCOUNT. The first step, as you might imagine, is to set up an Azure account. Since Windows Azure is a cloud service, you’ll need to create an account in the cloud, and provision a cloud environment. You can create an Azure account at the Windows Azure Developer portal. This is a pretty straightforward registration process that will require you to create a Windows Live ID if you don’t already have one and will require a credit card. At the conclusion of the registration process you should have access to Windows Azure, SQL Azure and AppFabric. At this point you haven’t created any cloud services; you’ve only created an account under which the services you create can be provisioned and deployed. STEP 2: PROVISIONING A SQL AZURE DATABASE This step may not be required by everyone, but most of the applications I’ve built have been database driven. Given that, whether creating a new application or moving an existing one to the cloud, I think it’s going to be a fairly common question to ask where the database lives and how you connect to it. The reasonable answer is that if my application is going to be hosted in the cloud, my database needs to be in the cloud too. The Windows Azure Platform provides Windows Azure Storage as well as SQL Azure for storing data. SQL Azure is most similar to the relational databases of the typical business application, so while Azure Storage may have scalability and cost advantages, SQL Azure provides the more familiar paradigm. Naturally I’m inclined towards SQL Azure to get started. In order to create my cloud database I’ll need to return to the Azure account that I set up in step 1 and navigate to the SQL Azure section of the portal https://sql.azure.com. To create a SQL Azure server, you’ll need to provide a username and password and the SQL Azure Developer Portal will create a server using a generated unique name similar to crkvq7vdhu.database.windows.net. With the SQL Azure server created, you can now create the database. There is also an additional requirement that you configure firewall rules to allow access. Again, for the sake of simplicity, you can just grant your local machines IP address access to the SQL Azure server. 9
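
If you would rather verify connectivity from code than from a tool, the sketch below shows the shape of a SQL Azure connection string. It reuses the generated server name shown above; the database name, table, login and password are placeholders you would replace with your own. Note the user@server form of the login, which SQL Azure expects, and that encrypted connections are required:

using System.Data.SqlClient;

class SqlAzureSmokeTest
{
    static int GetCustomerCount()
    {
        // Server name as generated by the portal; database, login and password are placeholders.
        const string connectionString =
            "Server=tcp:crkvq7vdhu.database.windows.net;" +
            "Database=MyAppDb;" +
            "User ID=myuser@crkvq7vdhu;" +
            "Password=<your password>;" +
            "Encrypt=True;TrustServerCertificate=False;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT COUNT(*) FROM Customers", connection))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}

Remember that the firewall rule created above must include the IP address of the machine running this code, otherwise the connection will be refused.
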
  • The Windows Azure Platform: Articles from the Trenches Lastly, you might be wondering, as I did, whether the newly created SQL Azure database is accessible via the familiar SQL Server Management Studio Tools. I was able to successfully connect after downloading SQL Server Management Studio 2008 R2. STEP 3: BUILDING A WEB APPLICATION FOR AZURE Having provisioned our cloud database and proven that you can connect to it with familiar SQL Server Management studio tools, and assuming you’ve created the tables required by your application, you’re ready to begin building your application. In order to do so you’ll need to install the Windows Azure SDK and the Windows Azure Tools for Microsoft Visual Studio 1.1. The good news about both of these is that they support Visual Studio 2008 and Visual Studio 2010. Once you fire up Visual Studio you’ll notice a new project template for “Windows Azure Cloud Service”. After choosing the cloud service template you will be prompted to choose from one of the cloud service ‘roles’; Web, Worker and WCF Service Roles. Assuming you’ve chosen “ASP.NET Web 10
  • The Windows Azure Platform: Articles from the Trenches Role", a solution containing two projects, a cloud services project and the familiar ASP.NET Web project, will be created. The only real difference between a standard ASP.NET web project and the ASP.NET Web Role project is the existence of a WebRole.cs file. The WebRole.cs serves as the entry point for Azure. When you hit F5 your Azure application starts up and runs inside the development Fabric. The Development Fabric simulates the Windows Azure cloud environment enabling you to run, test and debug Azure applications on the desktop! STEP 4: PACKAGING THE WEB APPLICATION FOR WINDOWS AZURE Packaging up the application for publishing to Azure turns out to be fairly simple. From within Visual Studio you can right click on the Cloud Services project and choose Publish from the context menu. This will package the web application into a .cspkg file, and also create the ServiceConfiguration.cscfg file. These two files are all you need to deploy your application to Windows Azure. STEP 5: DEPLOYING THE WEB APPLICATION TO AZURE. Now that you've packaged your ASP.NET Web Role, you'll need to return to the Windows Azure account you created in Step 1 and create your Windows Azure service. Under the Windows Azure tab choose "New Service" > "Hosted Service" and provide a name and description for your new cloud service. Once the Service is created there'll be two hosted service locations, staging and production. Under each will be a 'Deploy' button. Choose Deploy under Staging. This will bring up a screen asking for the two files created in Step 4. Provide both files, and deploy. After deploying the package and the configuration you'll be provided with a unique url for accessing your application. Now you'll also see that you have the ability to 'Run' the service. 11
  • The Windows Azure Platform: Articles from the Trenches The application won’t be accessible via the url until you Run it, so press Run, and wait for it, wait for it, wait for it…it takes a while to provision the Windows Azure infrastructure for your application, but once you get the green light you should be good to go. 12
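
Before wrapping up, it is worth a quick look back at the WebRole.cs entry point mentioned in Step 3. The sketch below is close to what the SDK 1.x project template generates; "DiagnosticsConnectionString" is the configuration setting name those templates use, and the handler shown simply recycles the instance when a setting changes:

using System.Linq;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Start collecting diagnostic data using the connection string
        // defined in ServiceConfiguration.cscfg.
        DiagnosticMonitor.Start("DiagnosticsConnectionString");

        // Ask for a restart if a configuration setting changes while running.
        RoleEnvironment.Changing += (sender, e) =>
        {
            if (e.Changes.Any(change => change is RoleEnvironmentConfigurationSettingChange))
                e.Cancel = true;
        };

        return base.OnStart();
    }
}
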
  • The Windows Azure Platform: Articles from the Trenches These are just a few of the baby steps I've taken to become familiar with Windows Azure. With these steps I've been able to demonstrate that developing for Windows Azure is largely the same development experience that I'm accustomed to. However, one of the more intriguing considerations when building for Windows Azure is the potential use of Windows Azure Storage as a data store instead of the more conventional relational database provided by SQL Azure. 13
  • The Windows Azure Platform: Articles from the Trenches THE BEST TOOLS FOR WORKING WITH THE WINDOWS AZURE PLATFORM By Sarang Kulkarni “A platform is known by the tooling available around it!” Much clichéd but still holds true. Windows Azure, though a fairly nascent cloud platform is aptly supported by some fantastic tooling which make development fun and a developer’s life easy. Let us get the usual suspects out of the way first to make way for some more interesting kids on the block, many of which I cannot do without. CATEGORY: THE USUAL SUSPECTS Microsoft Visual Studio 2010® Visual Studio 2010 (VS2010) is a stable development platform for Windows Azure. Though there are very few changes specific to Azure when compared with VS2008, the overall development experience is definitely superior. Windows Azure VMs support .Net Framework 4.0 from OS Version 1.2 and therefore it makes sense to use VS2010 to take advantage of the new features of .Net 4.0 in the cloud. As always, the Express edition is free. Microsoft SQL Server Management Studio® 2008 R2 The R2 release is recommended for working with SQL Azure. The biggest advantage being the comfort of an SQL IDE we have grown up with. I don’t think I need to wax poetic about this one, this is Bread and Butter. Again Express edition is free and recommended as it serves most of the needs. Download it from: http://www.microsoft.com/downloads/details.aspx?familyid=56AD557C-03E6-4369-9C1D- E81B33D8026B&displaylang=en. User Accounts and Local Security Policy Control Panel applets I know there’s nothing specific to Azure here. But it comes very handy to have a user with permissions as laid out at http://msdn.microsoft.com/en-us/library/dd573355.aspx to avoid any surprises related to user rights while running in the fabric. CATEGORY: WINDOWS AZURE STORAGE What: Cerebrata - Cloud Storage Studio Why: Cerebrata Cloud Storage Studio (CSS) is a WPF based client for managing Azure Storage, as well as hosted applications. CSS started as a commendable effort by a small firm to provide an intuitive visual access to the Azure Storage putting the Storage APIs to good use. It now stands as a one stop solution to manage everything under the Azure Storage, as well as a lot of things in the hosted 14
  • The Windows Azure Platform: Articles from the Trenches applications. Figure 1: Cloud Storage Studio - Connect to Azure Account You can design a table schema in CSS, perform CRUD operations on existing tables, download/upload table contents to/from the disk and filter table contents. Basic querying support is also provided which supports the WCF Data Services (formally ADO.NET Data Services) query syntax. Linq query support would have been a welcome add-on. Blob storage is a forte of CSS and all possible operations on Blobs and Containers are available. You can create containers, configure access policies, list blobs in a container replete with the folder structure, upload/download page/block blobs, rename, copy and move blobs, create and view blob snapshots (Very useful), create signed URL for a blob. MIME type configuration support is icing on the already nice cake. My only grudge is the very basic breadcrumb while navigating the container structure. CSS also features a simple yet effective service management UI. The design closely resembles that of the actual azure developer portal. The same features are offered plus a few more. The regular service management operations like connecting to hosted services, view, deploy, delete services, swap deployment slots, manage API certificates and manage affinity groups are available. A very useful feature we find here is a nifty little checkbox at the bottom of the create service deployment dialog which reads “Automatically run the deployment after creation” – a nice touch. 15
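
To give a feel for what that query syntax looks like in code rather than in a tool, here is a small sketch using the StorageClient library that ships with the SDK. The CustomerEntity class, the "Customers" table name and the choice of city as partition key are purely illustrative, and the connection string is assumed to point at your storage account:

using System;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// Illustrative entity; the PartitionKey/RowKey choice is application specific.
public class CustomerEntity : TableServiceEntity
{
    public CustomerEntity() { }
    public CustomerEntity(string city, string customerId) : base(city, customerId) { }

    public string Name { get; set; }
    public int Age { get; set; }
}

public static class CustomerQueries
{
    public static void ListLondonCustomers(string connectionString)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        CloudTableClient tableClient = account.CreateCloudTableClient();
        tableClient.CreateTableIfNotExist("Customers");

        // The WCF Data Services (LINQ) syntax the tools expose is the same one you write here.
        TableServiceContext context = tableClient.GetDataServiceContext();
        var londonCustomers = (from c in context.CreateQuery<CustomerEntity>("Customers")
                               where c.PartitionKey == "London"
                               select c).AsTableServiceQuery();   // handles continuation tokens

        foreach (CustomerEntity customer in londonCustomers)
            Console.WriteLine("{0} ({1})", customer.Name, customer.Age);
    }
}
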
  • The Windows Azure Platform: Articles from the Trenches Figure 2: Cloud Storage Studio - Deploy a Service It costs a totally worthwhile $60 per license. Notable alternatives are  Cerebrata's own CSS/e https://onlinedemo.cerebrata.com/cerebrata.cloudstorage/default.aspx which is a Silverlight application providing very basic but useful Storage Service administration  the open source Azure Storage Explorer http://azurestorageexplorer.codeplex.com/  Finally, the far from perfect yet still useful open source alternative Azure MMC Snap-in http://code.msdn.microsoft.com/windowsazuremmc. Azure MMC is in its second version, covers almost all the same bases as the Cloud Storage Studio and deserves a worthy mention. Figure 3: Windows Azure MMC 16
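
All of these tools ultimately drive the same storage REST API that your own code can call through the StorageClient library. The sketch below shows the kinds of blob operations discussed above (a public container, an explicit MIME type, a snapshot and a signed URL); the container name, blob name and connection string are assumptions, not anything a particular tool requires:

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class BlobOperations
{
    public static string PublishImage(string connectionString, string localPath)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient blobClient = account.CreateCloudBlobClient();

        // Create a container with public (anonymous) read access to its blobs.
        CloudBlobContainer container = blobClient.GetContainerReference("images");
        container.CreateIfNotExist();
        container.SetPermissions(new BlobContainerPermissions
        {
            PublicAccess = BlobContainerPublicAccessType.Blob
        });

        // Upload a block blob with an explicit MIME type.
        CloudBlockBlob blob = container.GetBlockBlobReference("logo.png");
        blob.Properties.ContentType = "image/png";
        blob.UploadFile(localPath);

        // Take a snapshot, then hand out a 30 minute read-only signed URL.
        blob.CreateSnapshot();
        return blob.Uri + blob.GetSharedAccessSignature(new SharedAccessPolicy
        {
            Permissions = SharedAccessPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(30)
        });
    }
}
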
  • The Windows Azure Platform: Articles from the Trenches What: LINQPad http://www.linqpad.net/ Why: It would not be an overstatement to term LinqPad by Joseph Albahari to be the best querying scratchpad available for Linq. LINQPad can query a varied set of data sources. Of particular interest to this discussion are SQL Azure, WCF Data Services (Think codename “Dallas”) and Windows Azure Table Storage. Yes Table storage! LINQPad steps in where Cloud Storage Studio stops being adequate - the querying capabilities are superior and the interface more powerful. Figure 4: LinqPad - Sample Query on the WADPerformanceCounters table As usual some of the best tools come free and LinqPad surely fits the definition. There is also a pro version available with some bells and whistles like auto-complete, Visual Studio integration etc. CATEGORY: WINDOWS AZURE DIAGNOSTICS What: Cerebrata – Azure Diagnostics Manager http://www.cerebrata.com/Products/AzureDiagnosticsManager/Default.aspx Why: Azure diagnostics has taken some time to reach the final form we see it in today. There are few tools which provide the comfort of an Event Viewer or a comprehensive management dashboard for working with the diagnostic data. Azure Diagnostics Manager (in public beta at the time of writing) attempts to achieve just that. The feature set is fairly comprehensive covering the following:  You can either connect to an Azure storage account to read the diagnostics information and find the deployments from there and connect to the listed deployments or choose to connect directly to a subscription and get a list of hosted services to monitor.  The Dashboard provides a bird’s eye view of all the diagnostic information collected. One may choose to view Event Viewer, Trace Logs, Infrastructure Logs, Performance Counters, IIS Logs, IIS Failed Request Logs, Crash Dumps and On Demand Transfer. 17
  • The Windows Azure Platform: Articles from the Trenches  If you have only deployed a service and are collecting none of these, fret not. Azure Diagnostic monitor also provides access to the diagnostic monitor inside your Roles as well as individual role instances through the Remote Diagnostics API. With this you can enable/disable any of the diagnostic information being collected or you can alter the verbosity/frequency. Figure 5: Azure Diagnostics Manager - Performance Counter Graphs CATEGORY: SQL AZURE What: SQL Azure migration wizard http://sqlazuremw.codeplex.com/ Why: As most of us working with cloud solutions might have already noticed, the largest chunk of the work coming to the System Integrators is the migration of existing applications to cloud. One of the key aspects of this is database migration. SQL Azure migration wizard helps simplify database migration. With the SQL Azure Migration Wizard we can analyze scripts for SQL Azure compliance, generate scripts and can migrate databases – schema and data. Migration is supported from SQL Server to SQL Azure, SQL Azure to SQL Server and SQL Azure to SQL Azure. Even in its 3.2.2 version it still has its share of quirks but is vastly improved and great for the mundane tasks in DB migration. 18
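
Stepping back to the diagnostics category for a moment, the data that Azure Diagnostics Manager displays has to be collected by the diagnostic monitor running inside your roles. A minimal sketch of configuring that in a role's OnStart is shown below; the processor counter and the transfer periods are examples only, and "DiagnosticsConnectionString" is the setting name used by the default project templates:

using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint   // the same applies to a web role
{
    public override bool OnStart()
    {
        DiagnosticMonitorConfiguration config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Sample CPU every 30 seconds and push the results to the
        // WADPerformanceCountersTable once a minute.
        config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
        {
            CounterSpecifier = @"\Processor(_Total)\% Processor Time",
            SampleRate = TimeSpan.FromSeconds(30)
        });
        config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        // Transfer warnings and errors from the trace logs as well.
        config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;
        config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
        return base.OnStart();
    }
}
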
  • The Windows Azure Platform: Articles from the Trenches CATEGORY: GENERAL DEVELOPMENT What: Fiddler http://www.fiddler2.com/fiddler2/ Why: Fiddler is a Web Debugging proxy. It allows us to inspect all incoming and outgoing HTTP(S) traffic on a machine. This is particularly helpful while working with the Azure Storage, Azure Service Management API, Remote Diagnostics Manager API and anything REST. Looking at the HTTP traffic gives an insight into how the Requests/Responses are constructed, what Responses are received and a host of other information that every web service developer/consumer will find handy. Figure 6: Fiddler – Statistics Fiddler scripting engine can be used to filter in/out requests and/or responses and also issue preconfigured responses. Fiddler can also target specific processes to filter traffic only from those processes. Fiddler provides an API which can be used in a .Net application to programmatically track network traffic and use almost all of Fiddler’s features. This has enabled some nifty Fiddler Extensions like Watcher - A Passive Security Audit tool http://websecuritytool.codeplex.com/ , Chad Oswald’s Request to Code http://www.chadsowald.com/software/fiddler-extension-request-to-code which gives the required code to issue captured http requests and the JSON Viewer http://jsonviewer.codeplex.com/ which visualizes JSON objects. 19
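
If you want to capture traffic programmatically rather than in the Fiddler UI, the FiddlerCore library exposes roughly the API shape sketched below. Treat this as an outline only: the event and method names reflect my reading of FiddlerCore and should be checked against the version you reference, and the storage host name filter and port are illustrative:

using System;
using Fiddler;

class StorageTrafficLogger
{
    static void Main()
    {
        // Log every request that goes to Azure Table Storage.
        FiddlerApplication.BeforeRequest += delegate(Session session)
        {
            if (session.hostname.EndsWith("table.core.windows.net"))
            {
                Console.WriteLine("{0} {1}",
                    session.oRequest.headers.HTTPMethod, session.fullUrl);
            }
        };

        // Listen on port 8877 and register as the system proxy; HTTPS is not decrypted here.
        FiddlerApplication.Startup(8877, true, false);

        Console.WriteLine("Capturing - press Enter to stop.");
        Console.ReadLine();
        FiddlerApplication.Shutdown();
    }
}
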
  • The Windows Azure Platform: Articles from the Trenches CHAPTER 2: WINDOWS AZURE PLATFORM ARCHITECTING FOR AZURE – BUILDING HIGHLY SCALABLE APPLICATIONS By Steven Nagy Two key reasons organisations move to the cloud are to reduce cost and leverage economies of scale. Unfortunately not every type of application is suited to the cloud, and more often than not, those that are suited for the cloud are not architected for scalability. Further, the Windows Azure Platform has a pricing model that if not considered during your architecture phase, can negate the cost benefits of moving to the cloud to begin with. This article will address the key things to consider when architecting highly scalable applications that are cost-optimised for the Azure platform. PRINCIPLES OF AZURE ARCHITECTURES The Windows Azure Platform already provides elasticity, redundancy, and abstractions from the distributed platform on which it is run. This gives us a flying head start when designing systems for the cloud, but there are still key measures we need to take to ensure our application doesn’t become its own worst enemy. Here we define five key tenets to keep in mind throughout the design and implementation phases of your project. PARTITION DATA Data partitioning is not a new concept 1. Traditionally it has helped us break up massive databases into smaller more manageable pieces, and to improve query performance by splitting unrelated data into different partitions. In scalable applications it is important for those same reasons, but also allows us to scale more effectively; imagine serving 500 requests per minute on a single database versus 50 requests per minute across 10 databases. Furthermore, storage is cheap. Consider Sql Azure pricing versus Azure Table Storage 2 for 1Gb storage: $10 and $0.15 per month respectively. Both are at least 3 times redundant. However not only is Azure Table Storage cheaper, it has inbuilt partitioning mechanisms that allow you to allocate every single entity (row) of data to a horizontal partition (or shard 3) based on the partition key you provide. In Table Storage, each partition is a physically different storage node, which means queries and requests can scale extremely efficiently. If you don’t have complex relational queries, this is the ideal choice. Denormalising your data can help immensely by removing those relationships and allowing ease of partitioning. This is essentially the premise of the ‘NoSql’ movement 4. You should also consider data duplication for further performance increases. Consider a search function for customers by age demographic or by city; by having two copies of the data in different 1 http://msdn.microsoft.com/en-us/library/ms190787%28v=SQL.100%29.aspx 2 http://www.microsoft.com/windowsazure/pricing/ 3 http://en.wikipedia.org/wiki/Shard_%28database_architecture%29 4 http://en.wikipedia.org/wiki/Nosql 20
  • The Windows Azure Platform: Articles from the Trenches partitions, your query and retrieval time is highly efficient. The flip side to this approach is the added complexity to managing multiple copies of data. Partitioning support in Azure can be summarised as follows:  Table entities are horizontally partitioned on partition key  Blobs are partitioned based on their container  Queues are partitioned on a per-queue basis  Sql Azure supports no partitioning Vertical partitioning is not supported by default however it makes sense to store smaller amounts of data together when the additional fields are not needed on the majority of requests. COLOCATION Sql Azure, Azure Storage, Azure Compute roles, and the AppFabric all have bandwidth costs for data moving in and out of the data centre. It makes sense to keep this in mind when building our applications. Azure already lets us choose our data centres and more importantly, we can co-locate components of our system via Affinity Groups such that network traversal is minimal and faster. Luckily this is a deployment consideration and not so important with up front design. CACHE A more important consideration is the various opportunities to utilise caching mechanisms. There are many ways that cache can be harnessed to minimise transactions; from end user http requests, for underlying data stores, or memoization5 purposes. When almost everything in the platform is accessible via a REST interface, it pays to invest effort into caching. Some cache concepts to consider are:  Client side timed cache – content that expires after a certain amount of time, preventing client browsers from requesting a page, serving a local copy instead  Entity Tags6 (ETags) - Allow you to specify a ‘version’ in a http header field; server can indicate the version has not changed, in which case no other data is exchanged, otherwise can return all the data for that request  ASP.Net Page level Cache  Distributed Cache7 - has multiple nodes that either all share the same content (shared everything) or have unique sections of the cache (shared nothing); shared everything distributed caches work well in Azure because of the throwaway nature of commodity hardware and ease of scale STATE 5 http://en.wikipedia.org/wiki/Memoization 6 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html 7 http://msdn.microsoft.com/en-us/magazine/dd942840.aspx 21
  • The Windows Azure Platform: Articles from the Trenches State has often been cast as the enemy of concurrent programming and the same applies at higher levels of abstraction as well, such as multiple compute instances. Mutable state requires locking and tracking in concurrent environments, which adds overhead and complexity to applications. Therefore reducing, or even removing state is an ideology worth pursuing. Sometimes state is specific to a single user, such as session state. Load balancers in the Azure data centres are round-robin, therefore as soon as you have more than one web front end you can no longer store session state in process (default); if session state is critical to your application, look to move it to Sql Azure or Table Storage instead. However session state is typically abused and is generally not actually required for the situations in which it is used. As an alternative to sessions, consider claim based security, such that any page request is accompanied by a set of claims. The AppFabric Access Control Services can assist with this. DISTRIBUTE WORKLOADS EFFECTIVELY Typically when multiple sources need to access a resource there is a level of contention. Locks and leases need to be taken and other threads are blocked until contention is resolved. As with state, this problem exists in all forms of concurrent programming, and is as important in multi-instance work sharing scenarios. Worker roles need to pick up items for processing, but when there are multiple instances of the same worker role, how do we ensure that each instance does not pick up the same work item? The ‘Asynchronous Work Queue Pattern’ is one such solution. By providing a robust, redundant queuing mechanism that guarantees unique distribution of work items, the workers are ignorant of leases and locks and can focus on the job of processing work items. Such a queue will be reusable for many different work types, and the Windows Azure Storage Queue service is an ideal candidate. There are other messaging architectures that allow us to decouple our components. AppFabric allows a ‘NetEventRelayBinding’ for Publish/Subscribe scenarios, for example. MAXIMISE RESOURCES One could argue that if your CPU is not at 100% it is being underutilised. In Azure you pay for the core regardless of usage, so it makes sense to get the most bang for your buck. When using worker roles, multi-threaded architectures are often forgotten. Since adding another instance means an additional hourly cost, first ensure you are getting the most out of your current instances. If your worker (or web role for that matter) has lots of IO work, it makes sense to use multiple threads. Auto-scaling resources is worth investigating also. Typically an IT department will maintain enough servers to cope with their peak periods; consider instead starting at trough capacity, and use auto- scaling functionality to add instances dynamically. When load starts to taper off, start scaling down, cutting costs as you do. 22
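
A minimal sketch of the asynchronous work queue pattern described above is shown below, using the Windows Azure Storage Queue service from a worker role. The queue name, the "DataConnectionString" setting name and the five minute visibility timeout are assumptions you would tune for your own workload:

using System;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

public class WorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        var account = CloudStorageAccount.Parse(
            RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
        CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("workitems");
        queue.CreateIfNotExist();

        while (true)
        {
            // The message becomes invisible to other instances for 5 minutes;
            // if this instance fails before DeleteMessage, the item reappears.
            CloudQueueMessage message = queue.GetMessage(TimeSpan.FromMinutes(5));
            if (message == null)
            {
                Thread.Sleep(TimeSpan.FromSeconds(5));
                continue;
            }

            ProcessWorkItem(message.AsString);   // hypothetical processing method
            queue.DeleteMessage(message);
        }
    }

    private void ProcessWorkItem(string payload)
    {
        // Application-specific work goes here.
    }
}

Because the queue guarantees that a dequeued message is hidden from other instances until its visibility timeout expires, the workers never need to coordinate locks or leases between themselves.
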
  • The Windows Azure Platform: Articles from the Trenches Currently you can utilise content delivery services (CDN) to push blobs out to localised edges. This will help improve latency for your customers. Also consider what could qualify for blob storage; essentially anything static is a contender:  PDF, Word documents  Videos  Website images  Website CSS and JavaScript libraries  Any static HTML website pages  Silverlight files (XAP) Blob storage currently allows blobs to be stored in the root container. This feature was specifically included so that Silverlight applications running from blob storage could place a cross domain policy file at the root of the URL namespace (a requirement for cross domain policy files). SUMMARY While not extensive, this article gave you a brief overview of some key principles to keep in mind when architecting applications to run on the Windows Azure Platform. By following these guidelines you should be able to achieve core objectives of scalability and cost recovery in the cloud. 23
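
To make the root container point above concrete, the fragment below uploads a Silverlight cross domain policy file into the special "$root" container so that it is served from the root of the storage URL namespace; the connection string and local file path are assumptions:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class RootContainerSetup
{
    public static void PublishCrossDomainPolicy(string connectionString, string localPolicyPath)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        CloudBlobContainer root = account.CreateCloudBlobClient().GetContainerReference("$root");
        root.CreateIfNotExist();
        root.SetPermissions(new BlobContainerPermissions
        {
            PublicAccess = BlobContainerPublicAccessType.Blob
        });

        // Silverlight looks for clientaccesspolicy.xml at the root of the URL namespace.
        CloudBlockBlob policy = root.GetBlockBlobReference("clientaccesspolicy.xml");
        policy.Properties.ContentType = "text/xml";
        policy.UploadFile(localPolicyPath);
    }
}
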
  • The Windows Azure Platform: Articles from the Trenches THE WINDOWS AZURE PLATFORM AND COST-ORIENTED ARCHITECTURE By Marcus Tillett COST IS IMPORTANT Cost-orientated development is nothing new. A low cost approach to building an application or product is desirable but the methodology used to achieve this is not always very sophisticated. When considering a cloud platform such as Azure the cost implications of the chosen architecture can be significant and require a more sophisticated approach. While a traditional on premise or hosted architecture may not consider cost as a significant factor, cost is an area that receives significantly more focus for Azure. There are a range of costs that need to be considered; these costs need to be considered in the context of Azure and of the end to end development and application lifecycle management processes. WHAT COSTS TO CONSIDER Summary:  Cost is much more of an architectural consideration for Azure than for a traditional on premise or hosted solution.  Cost implications of the chosen architecture can be significant.  Costs should be considered in the context of the end to end development and application lifecycle management processes.  Model costs for the chosen architecture but most importantly test the model. The development process can be a significant cost consideration for Azure. There is a continuum of development strategies for Azure; from, at one extreme, using the Azure environment for development, to the other extreme, developing without any reference to Azure. There are cost implications and significant other pros and cons across this continuum. As an example, consider the use of software factories. With a software factory that uses a strict assembly process, the cost of using the production platform may be prohibitive due to the expense of both the platform and training required. These concerns would drive a cost-oriented architecture where all Azure specific components are abstracted from the developer or potentially replaced with non-Azure components. While this may be an extreme example, it does highlight one of several areas to be considered. Another significant topic is the methodology applied to the migration of an existing application or the consideration for setting up data required by a new application. Migration and set up need to include both the application and the data. The time, processes and procedures needed to transfer large volumes of data or complex data, in particular, may be a significant undertaking. With the potential complexity of managing changes to a live data source, the total business cost of the chosen approach can be a critical factor. The cost implication of the platform itself is, perhaps, the obvious area to necessitate a cost-oriented architecture. It is natural to be drawn to, for instance, the dramatic price difference for data storage between SQL Azure and Windows Azure storage. While this may be critical to some applications, it is better on balance to construct a solid architecture, as this will provide a better long term approach than focusing initially on cost. This should be supported by modelling the costs of all the components 24
  • The Windows Azure Platform: Articles from the Trenches of the application. However, it is even more important to test this model for the most cost critical aspects of the application, thereby providing an understanding of how the application design and the charging mechanisms of Azure impact the cost model. With this information the architecture can be reviewed for significant cost savings. For any aspects that are cost critical, monitoring should be included in the final application and used to tune the system while ensuring that the evolution of Azure and the application are analysed for significant cost implications. Indeed monitoring the whole system as a means to verify costs and SLA is another architectural consideration. As a way to augment the full cost modelling process, there are some scenarios where the cost of the platform suggests a cost-orientated architecture. One of these is multi-tenanting of an application where there are high tenant numbers. A basic on premise or hosted server model with a pair of servers can enable the creation of a separate IIS web site and SQL Server database for each tenant. This model supports 10s or perhaps 100s of tenants for nearly the same cost as a single tenant. Translating the same architecture to Azure might consist of a Windows Azure Web Role and a 1GB SQL Azure database. This would equate to an approximate monthly cost of US $100 per tenant, but the cost of this Azure architecture scales linearly with tenant numbers. This is not to state that Azure is not suitable for multi-tenanted applications, but that where cost is a critical factor for the application a different architectural approach may be required. CONCLUSION Whether the considerations described here could be termed cost-driven8 or cost-oriented architecture9,10, the terminology is less important than the realisation that cost is much more of an architectural consideration for Azure than for a traditional on premise or hosted solution. 8 Lessons Learned: Building Multi-Tenant Applications with the Windows Azure Platform http://microsoftpdc.com/Sessions/SVC33 9 Thinking of... Delivering Solutions on the Windows Azure Platform? http://www.amazon.co.uk/Thinking-Delivering-Solutions-Platform-Questions/dp/0956155634/ 10 Windows Azure Platform for Enterprises http://msdn.microsoft.com/en-us/magazine/ee309870.aspx 25
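
As a rough cross-check of the US $100 per tenant figure above, assuming the US $0.12 per hour small compute instance rate of the time and the roughly US $10 per month for a 1GB SQL Azure database cited earlier in this book (current rates may differ):

  one Web Role instance: $0.12 per hour over roughly 730 hours is about $88 per month
  one 1GB SQL Azure database: about $10 per month
  total: about $98, i.e. roughly US $100 per tenant per month, so 100 tenants cost about $10,000 per month, whereas the shared on premise model described above stays close to the single tenant cost.
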
  • The Windows Azure Platform: Articles from the Trenches DE-RISKING YOUR FIRST WINDOWS AZURE PROJECT By Simon Munro Developer enthusiasm for building solutions based on Azure is not always shared by business. While it is great (and perhaps obvious to us) that the cloud is ‘the way of the future’ some individuals and organizations and vendors are ready for the change while others are not. Not all vendors have technologies for the cloud and many businesses, products, industries and jobs will go as the cloud wave washes them out to sea. Vendors are scrambling for attention and pushing their biased marketing oriented opinions through the biggest dinosaurs of all – the print media, that culturally could not even cope with the changes brought on by the Internet. Most anti-cloud and vendor bashing opinion plays on fear and its business cousin risk, where the urge is to maintain the status quo in our (currently) risk-averse environment. It is unsurprising then that the people that we need to make decisions about cloud computing in our own organizations are confused, wary and reluctant to make a commitment to our latest idea of running our solution on Azure. The term ‘the cloud’ has become synonymous with ‘the web’ and is indistinct from ‘cloud computing’ platforms that we are interested in – the unfortunate side effect being that the behaviour of Google, Facebook, Apple and other web-consumer facing properties that willy-nilly change terms of service and sell personal data for profit casts a shadow over business oriented cloud computing services. While the dust may settle at some point in future, if we want to build a solution on Azure any time soon, we will have to take responsibility for helping business understand the issues in order to gain their support. While we may prefer to deal only with technical issues, the current reality is that in most environments we have to proactively discuss the perceived risks and demonstrate that we, as well as the Microsoft and the Windows Azure platform, are actively managing and reducing risk. POPULAR RISKS Risks to data are by far the most publicised because once data is in databases that are outside of an organization’s locked-down data centre a degree of control and authority over the data is lost. Unlike students that go and live in co-ed dorms, data does not get drunk and put pictures of itself up on Facebook when it leaves home, but the suspicion still remains that off premise data is a high risk. While the risk to data may increase, the actual risk, in most cases, is greatly exaggerated and manageable. Process related risks are also well known, centred on the involvement of other parties in the operational aspects of the solution. No longer can business dictate service levels or even have confidence in an external supplier of services that they may have had with their own internal IT. Like with data, there are real issues here that have fairly complex contractual ramifications as customers attempt to reduce vendor lock-in, guarantee service levels and maintain operational, security, performance and other standards. COVERT RISKS While mainstream CIO information sources popularise some risks by extensive coverage, there are many risks that are just as real but less well known, often due to their more technical nature. 26
  • The Windows Azure Platform: Articles from the Trenches The most obvious is the lack of skills and experience in creating secure, reliable and performant cloud computing solutions. This also related to the problem of development engineering costs that could be higher than simply throwing hardware at performance bottlenecks. Even Microsoft, as our trusted provider of platforms and tools still has risks embedded within Azure. The lack of on-premise alternatives to cloud technologies such as Azure tables and queues makes the commitment to the platform quite high (a kind of vendor lock-in) and the tooling is still immature and unable to easily support accepted engineering practices such as continuous integration (see ‘Using a CI build to achieve an automated deployment of your latest build’ by Grace Mollison) . NON-TECHNICAL TACTICS FOR REDUCING RISK While ultimate responsibility for managing risk falls to project managers and other people within the organization, the identification of risk still remains the responsibility of everybody on the team. By downloading this book you have more knowledge of cloud computing than many of your co- workers, so before getting into the technical aspects, you will need to shoulder additional responsibility and deal with some aspects of reducing risk that do not involve code. CHOOSE THE CORRECT APPLICATION Choose something simple that is better suited to cloud computing, such as one that is public facing and may have demand peaks. Build on those successes before tackling applications that contain sensitive data, integrate with a lot of other systems, are a migration of an existing legacy system or contain a lot of traditional database storage and reporting. ENGAGE EARLY Even if your project is a low profile skunk works development, you need to engage with legal, compliance, operations, finance, audit and other parts of the business sooner than usual. Normally we would not worry about throwing up a new website onto our existing data centre, but if you surprise people with a rogue cloud computing application it may get shot down. UNDERSTAND THE PRICING AND OPERATIONAL MODEL As much as it may look simple on the surface, digging deeper into the pricing, billing, SLA’s and related aspects of cloud computing platforms can become complicated, with broad reaching impacts on legal positions, compliance and interdepartmental feuds. You have to at least put the Azure prices in a spreadsheet with your estimated requirements and put an annotated printout of the SLA in your project sponsor’s hands. UNDERSTAND THE IMPLICATIONS While it may be unnecessary to do a full threat model, you need to understand the possible financial, reputational and other risks if your application is compromised or the data gets lost. 27
  • The Windows Azure Platform: Articles from the Trenches Understanding the effects of loss should influence your approach to what data is stored on the cloud, for how long and whether it is moved to on-premise storage. FAMILIARISE YOURSELF WITH ON-PREMISE RISKS Because cloud computing is seen to have security risks, the focus on security often means that the solution is more secure than the on-premise counterparts. Whenever defending the risks of cloud computing make sure that you compare them to the existing everyday risks of the existing on- premise platform. Not all solutions, networks and other infrastructure can actually deliver the availability and security that they promise. UNDERSTAND THE APPETITE FOR RISK Culturally, startups can absorb cloud computing risks as part of their overall risk exposure compared more risk averse organizations such as banks that are, at least this year, less likely to absorb additional risk. More mature organizations have processes and committees for managing risks and, although it may ultimately be the project sponsor’s responsibility, you need to get a feel for the ability of the organization to take on risk before you pitch your big idea. TECHNICAL TACTICS FOR REDUCING RISK HOW EXTREME? Microsoft has made it quite simple to take a good ‘ol ASP.NET web application with an underlying SQL database and throw it up onto the Azure cloud with minimal changes. On the other hand, building a well architected solution that has been optimised for a cloud computing environment is more difficult, involved and risky. If your system is being built within a risk averse environment and does not need to be built for the cloud, forgo Azure storage, worker processes, federated identity management and other cloud specific technologies and build a simple solution with web roles and a SQL database. Azure will support you well whichever approach you choose, but you need figure out how much on the fancy new stuff you really need and make those decisions early. DEFINE THE APPROACH TO DATA When it comes to cloud computing risks, data is the most sensitive and active topic and it needs to be addressed early on in the solution design. Fortunately SQL Azure addresses many of the concerns and risks around the NoSQL-like Azure tables by providing a familiar database platform if such familiarity is required, but ultimately Azure storage, caching and other technologies need to be considered in any good Azure architecture. Whatever the bias for storage in the Azure cloud, there is still the issue that the data is in the cloud and it needs to be dealt with in your architecture. There may be a requirement to move or copy data from Azure to an on premise database for reporting, integration with other systems or even just the feeling that the data is safer. MANAGE THE ENGINEERING COST 28
  • The Windows Azure Platform: Articles from the Trenches Unless you have built a reasonable sized application on Azure and deployed it in a live environment there are going to be unforeseen technical challenges that will present themselves. By reading this book you are clearly on the right track and trying to learn from the experiences of others, but you need to do a lot more than just read or learn on the job. You need to install the tools, write code, deploy, put it under load, scale up, scale down, debug, diagnose and try out a lot of unfamiliar patterns and technologies just to reduce the impact of unforeseen quirks. IMPLEMENT WITH GOOD ENGINEERING PRACTICES The future of your first Azure application is fairly unsure – cast your mind out two years and you cannot be sure that your architectural choices were correct, technical components have been added or abandoned, regulations have changed or the attitudes of your organization towards cloud computing have altered. The concerns raised by the software craftsmanship movement of maintainability, testability, extensibility are amplified in such an environment which is years from settling down. The Azure combination of a well established platform in the .NET ecosystem and some new technologies, approaches and thinking thrown in means that we have both the need and the frameworks to craft solutions properly to reduce the risk that we are exposed to. Testability, inversion of control, loose coupling and other software craftsmanship techniques are well supported, understood and debated on the .NET platform and are therefore (reasonably) portable onto Azure. You need to hone these skills as single layered, monolithic architectures that seem easy at first and are encouraged by Microsoft marketing and tooling will result in an approach with high and unnecessary risk in an already risky space. DEVELOPER RESPONSIBILITY While technologists may be excited at the technical opportunities of cloud computing, business and other decision makers are probably more wary of the cloud than any other (recent) computing technology shift. They are reading conflicting messages by vendor marketers and self proclaimed cloud experts while their own staff are both protecting existing jobs and whispering discord in the passageways. So while risk management and selling of architectures may not be amongst the most exercised developer skills, cloud computing requires that we take cloud computing to the business and take some responsibility for allaying fears. 29
  • The Windows Azure Platform: Articles from the Trenches TRIALS & TRIBULATIONS OF WORKING WITH AZURE WHEN THERE'S MORE THAN ONE OF YOU By Grace Mollison I had enormous fun working on an Azure project, See the Difference, that took 7 weeks from start of development to handing over to the client. The technology stack used was: Windows Azure hosting, Windows Azure Storage, SQL Azure, ASP.Net MVC, N2CMS, Spark View Engine, Castle Windsor, xVal, PostSharp. There was one bugbear in that the Azure development experience is NOT designed for a team of developers and I needed to get that sorted out. So where did I start? With a list of course. Here were the big ticket items:  The ability to set up three environments: Development, Testing and UAT. Testing and UAT to be accessible by all members of the team  Shared access to the hosted environment  Automated deployments to the cloud as part of a CI build. After all no self-respecting development team doesn't have a continuous integration build do they? DEVELOPMENT ENVIRONMENT For the development environment we stuck to Visual Studio 2008 SP1. Visual Studio 2010 was in beta 2/RC when we undertook the development but with all the potential unknowns with Azure that was a step too far. The Azure developer tools were installed on each developer workstation and the Azure SDK on the build server. There was an upgrade to the Azure SDK during the development cycle which the development team said was needed, which meant updating the various machines that constituted the environment manually (alas, no WSUS  ). Fortunately this only happened once during the development cycle. In addition to Visual Studio we also supplemented the development environments with a few extra tools that provided a more complete development experience. TEST ENVIRONMENT The Test environment proved to be more challenging. The most pragmatic way to sort it out was to provision another development workstation running the development fabric. But (yes I know there's always a "but") the Development fabric runs against the local loopback address. To get round this an SSH tunnel had to be set up between the target machine and the client machines that needed to access it. Alas this proved to be slightly less than user friendly, plus the fact that the random allocation of ports for the local storage fabric had to be resolved after each new deployment made it basically unworkable. The differences between the Development fabric and Azure fabric were also impacting the team deliverables as we ended up seeing differences in behavior or could only test certain functionality in the staging environment. We resorted to using Azure Staging as our Test environment. 30
• The Windows Azure Platform: Articles from the Trenches
I was anticipating an easy ride from here on but.... yes, it's another of those "buts".
CERTIFICATES
The team members needed to either use their own self-signed certificate or to use a certificate I generated which was then uploaded onto Azure. As the team was small and fluid the decision went with using one I generated. This turned out to be a good call, as we did have problems with certificate connections apparently timing out after some time for some team members for no obvious reason. Because there was only one certificate to worry about it was relatively painless to resolve the problems around its use. It is bad practice to share certificates in this way, but pragmatism was the order of the day. For a larger team with a longer development cycle I would advocate each developer using a personal certificate which can then be easily revoked. One thing we quickly learnt was that in the early days of development, suspending then deleting was the safest approach to deploying a new package. The small team meant it was easy to communicate the change of URL this caused.
WHEN THINGS GO WRONG
It's a fairly nerve-racking experience when things go wrong, as often you can do nothing but wait for Azure to barf and throw a Dr Watson, and there's no real feedback when Azure tries to spin up the roles. Alas, as soon as we got to UAT we had to give up our staging environment and minimise changes to the Staging URL, as both the client and a 3rd party needed to know the URL. The loss of this environment for system testing meant we were forced to press my personal Azure account into service as the Staging environment. We did get the automated deployment in place but it's a tale too long to describe in this article.
SUMMARY
The Windows Azure Platform may not be quite ready for team development out of the box, but once you understand what needs to be addressed the barriers to team development are easily overcome. With a small amount of up-front work you can treat development for the Windows Azure Platform as you would any other application developed using your familiar team development tools. 31
• The Windows Azure Platform: Articles from the Trenches
USING A CONTINUOUS INTEGRATION BUILD TO ACHIEVE AN AUTOMATED DEPLOYMENT OF YOUR LATEST BUILD
By Grace Mollison
This article assumes familiarity with Team Foundation build and MSBuild concepts such as tasks and properties. Setting up a Continuous Integration (CI) build to automatically push a successfully built package directly to Azure cannot be achieved straight out of the box but requires some additional work. This article outlines an approach taken whilst delivering the See the Difference project using the Windows Azure Platform.
GETTING THE RIGHT "BITS"
The first thing that was done was to collate and configure the components that would be needed to allow the build server to access the target Windows Azure portal via a command line. To do this requires using the Azure Service Management API. Using the API requires an X.509 certificate. I created a self-signed one using the makecert tool which is part of the Windows SDK. An example of how to do this is shown below:
"c:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\makecert" -r -pe -a sha1 -n "CN=Windows Azure Authentication Certificate" -ss My -len 2048 -sp "Microsoft Enhanced RSA and AES Cryptographic Provider" -sy 24 MySelfSignedCert.cer
The blog post Creating and using Self Signed Certificates for use with Azure Service Management API explains in detail how to configure the certificate on the target Azure portal and the machine that needs to communicate with the portal. I downloaded the Windows Azure Service Management PowerShell CmdLets and also the Windows Azure Service Management API Tool, which are both handy for remotely accessing the Azure portal via the Service Management API. At this stage I had no idea which one I would be using. I tried them both as part of a build and found that I preferred using the service management API tool csmanage (despite being a big fan of PowerShell). The blog post referred to above illustrates the use of the X.509 certificate, the API and PowerShell to deploy to the Azure staging environment.
PACKAGING FOR DEPLOYMENT
Next I looked at packaging the application ready for deployment. There are two key things when packaging the application from the command line:
1. Obtain the role types and names, as these will be needed to construct the package
2. Make sure the location of the service definition file is known
The ServiceDefinition.csdef file contains the role types and names needed by the Windows Azure command line tool cspack to construct the package. Below is a snippet from a ServiceDefinition.csdef file illustrating a simple example with one web role. The number of instances does not matter to cspack: 32
• The Windows Azure Platform: Articles from the Trenches
<ServiceDefinition name="SeeTheDifference.Cloud" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="SeeTheDifference.Web" enableNativeCodeExecution="true">
    <InputEndpoints>
If cspack is not run from the correct place the package will not be constructed correctly, which is why the location of the ServiceDefinition.csdef file is so important.
DEPLOYING
At this stage I was able to package the application and deploy to the Azure portal via MSBuild. We had concerns with this approach regarding problems with the actual package affecting the deployment. In particular we were concerned about what to do after handover to the client, when a little more caution would be called for. A change of plan was decided upon. The new plan was to push the package to blob storage and then the client would be able to carry out the deployment at their convenience. To push the package to blob storage, a C# console application I called LoadBlob was written that could be called from the MSBuild script. This application pushed the package to a pre-determined container. It was decided that storing the configuration (.cscfg) file in blob storage was also a good idea as it would reduce the risk of non-production configuration settings being used. During testing I was unable to get the service management API to use the stored configuration file. It was only able to use one stored on the local system, but as the end-to-end deployment process we were implementing actually required a pause for breath before the push to Azure Staging or Production, this issue did not affect the implementation of the CI build process. Finally, after testing all the constituent parts, they were incorporated as part of the CI build. Below is a snippet from a TFSbuild.proj file where I overrode the target AfterDropBuild. The AfterDropBuild task is called after dropping the built binaries and I used it to insert some commands to allow the build to use cspack (equivalent to zipping the DLLs and configuration files) to package the cloud service package, which is then pushed up to blob storage ready for deploying to Staging or Production.
<PropertyGroup>
  <PathToAzureTools>c:\Program Files\Windows Azure SDK\v1.0\bin\cspack.exe</PathToAzureTools>
  <cPkgCmd>"$(PathToAzureTools)" SeeTheDifference.Cloud.csx\ServiceDefinition.csdef /role:SeeTheDifference.Web;seeTheDifference.Cloud.csx\roles\SeeTheDifference.Web\approot;SeeTheDifference.Web.dll</cPkgCmd>
  <LoadblobPath>c:\TOOLS\AzureDeployment</LoadblobPath>
  <LoadBlobCmd>$(LoadblobPath)\Loadblob.exe</LoadBlobCmd>
</PropertyGroup> 33
• The Windows Azure Platform: Articles from the Trenches
<Target Name="AfterDropBuild" DependsOnTargets="DeriveDropLocationUri" Condition=" '$(IsDesktopBuild)'!='true' ">
  <Message Text="cspack creating a package for deployment"/>
  <Exec Command="$(cPkgCmd) /out:c:\Drops\SD_Deploy\$(BuildNumber).cspkg" WorkingDirectory="c:\Drops\test\$(BuildNumber)\Release\SeeTheDifference" />
  <!-- Load blob to Azure into deployment container set via config file settings; target container will be cleared before uploading -->
  <Message Text="Copying '$(BuildNumber)'.cspkg to deployment container in Azure" />
  <Exec Command="$(LoadBlobCmd) -upload $(BuildNumber).cspkg" WorkingDirectory="c:\Drops\SD_Deploy" />
</Target>
The screenshots below show the uploaded cspkg in blob storage: The deployment could then be completed by using a user-friendly tool like Cerebrata Cloud Storage Studio. 34
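The LoadBlob console application itself is not listed in the article. As a rough sketch of what such a tool might look like using the StorageClient library – the container name, hard-coded connection string and lack of error handling are my own simplifications, not details from the See the Difference project:

using System;
using System.IO;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class LoadBlob
{
    // Usage: Loadblob.exe -upload <packageFileName>
    static void Main(string[] args)
    {
        string fileName = args[1];

        // In a real tool the connection string would come from a config file.
        CloudStorageAccount account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey");

        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("deployment");
        container.CreateIfNotExist();

        // Push the package from the working directory up to the pre-determined container.
        CloudBlob blob = container.GetBlobReference(Path.GetFileName(fileName));
        blob.UploadFile(fileName);

        Console.WriteLine("Uploaded {0} to the 'deployment' container", fileName);
    }
}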
• The Windows Azure Platform: Articles from the Trenches
USING JAVA WITH THE WINDOWS AZURE PLATFORM
By Rob Blackwell
With a name like Windows Azure, you could be forgiven for thinking that Microsoft's cloud computing offering is a Microsoft-only technology. In fact it has a lot to offer Java developers through its use of open standards and RESTful APIs.
ACCESSING WINDOWS AZURE STORAGE FROM JAVA
WindowsAzure4J is an open source library that can be used to access Windows Azure Storage from Java applications, running on Windows Azure or elsewhere. Download the JAR file from http://www.windowsazure4j.org/ . You'll also need to grab some other dependencies: commons-collections-3.2.1.jar, commons-logging-1.1.1.jar, dom4j-1.6.1.jar, httpclient-4.0-beta2.jar, httpcore-4.0.jar, httpcore-nio-4.0.jar, httpmime-4.0-beta2.jar, jaxen-1.1.1.jar and log4j-1.2.9.jar. To get started, you'll need an account name and account key from the Windows Azure portal. Paste these into the sample code provided with WindowsAzure4j to use Blobs, Queues or Tables. If you are an Eclipse user, you can also install the Windows Azure Tools for Eclipse http://www.windowsazure4e.org/ 35
• The Windows Azure Platform: Articles from the Trenches
The Windows Azure Storage Explorer running in Eclipse.
RUNNING JAVA CODE ON WINDOWS AZURE
If you want to host a Java application in Windows Azure, there are a number of considerations. The first thing to note is that even if your Java application is a Web application you probably won't want to use an Azure Web Role. The principal difference between web roles and worker roles is whether Internet Information Services (IIS) is included. Most Java developers will want to use a Java-specific web server or framework, so it's usually best to go with a worker role and include your choice of web server within your deployment package. You'll also need to bootstrap Java from a small .NET program that will essentially invoke the Java runtime through a Process.Start call. Both web roles and worker roles are provisioned behind a load-balancer so either is suitable for hosting web applications. In a worker role you just have to do some additional plumbing to connect up your web server to the appropriate load-balanced input endpoint. So for example, the public-facing port 80 of yourapp.cloudapp.net might get mapped to, say, port 5100 in your worker role. The following code allows you to determine this port at runtime:
RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["Http"].IPEndpoint.Port 36
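The small bootstrapping program mentioned above is not shown in the article. A minimal sketch of a worker role that resolves the load-balanced port and launches a bundled Java web server via Process.Start might look like the following – the endpoint name "Http", the folder layout and the jar name are assumptions made purely for illustration:

using System.Diagnostics;
using System.IO;
using Microsoft.WindowsAzure.ServiceRuntime;

public class JavaBootstrapperWorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        // Port assigned by the Windows Azure load balancer to the "Http" input endpoint.
        int port = RoleEnvironment.CurrentRoleInstance
            .InstanceEndpoints["Http"].IPEndpoint.Port;

        string appRoot = Path.Combine(Directory.GetCurrentDirectory(), "MyApp");

        // Launch the bundled JRE, passing the port so the Java web server can listen on it.
        var startInfo = new ProcessStartInfo
        {
            FileName = Path.Combine(appRoot, @"jre\bin\java.exe"),
            Arguments = string.Format("-cp MyApp.jar;lib\\* Start {0}", port),
            WorkingDirectory = appRoot,
            UseShellExecute = false
        };

        using (Process java = Process.Start(startInfo))
        {
            // Keep the role instance alive for as long as the Java process runs.
            java.WaitForExit();
        }
    }
}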
• The Windows Azure Platform: Articles from the Trenches
Fortunately both the Tomcat Solution Accelerator and AzureRunMe handle all of these technicalities for you. The Tomcat Solution Accelerator is a good choice if you have a traditional Java-based web application. It supports Java Servlet and Java Server Pages applications, possibly packaged as a WAR file. It can be downloaded from http://code.msdn.microsoft.com/winazuretomcat . The accelerator walks you through the process of creating an Azure cloud services package file that contains your application as well as the Tomcat server and Java Runtime. It automatically handles the necessary configuration. Just upload the resulting cspkg file to Windows Azure, wait for it to deploy, then bring up your web browser and browse to http://yourapp.cloudapp.net
AZURERUNME
AzureRunme (http://azurerunme.codeplex.com/) doesn't assume any particular web server or framework. In fact you could just run a straightforward command-line application with no visible user interface. That said, I've used it successfully with both Restlet (http://www.restlet.org/) and Jetty (http://jetty.codehaus.org/jetty/ ). Imagine that you were going to run your application from a USB drive and that you weren't allowed to install any software onto the machine – you'd have to include the Java Runtime Environment (JRE), all the library JAR files and any data all in subdirectories of the USB stick. You'd probably create a .BAT file at the top level to run everything. Like this:
cd MyApp
..\jre\bin\java -cp MyApp.jar;lib\* Start %1
AzureRunme takes a similar approach – put all these files together in a single ZIP file and upload it to Blob storage. Download the AzureRunMe cspkg file and use this to bootstrap your Java code. Notice that the batch file takes a parameter %1 – this is the port that you should use if you want to bring up a web server – the load balancer will direct all HTTP traffic to your application on this port. AzureRunme comes with a Trace listener that uses the Service Bus to relay standard output and any log4j messages back to a command window on your desktop machine. It makes it easy to see trace messages, watch your application's progress and see any exception messages. 37
  • The Windows Azure Platform: Articles from the Trenches AzureRunMe Trace Listener showing log messages relayed via the AppFabric Service Bus. For more information about Interoperability on the Microsoft platform see http://www.interoperabilitybridges.com/ 38
  • The Windows Azure Platform: Articles from the Trenches CHAPTER 3: WINDOWS AZURE AUTO-SCALING WINDOWS AZURE COMPUTE INSTANCES By Steven Nagy INTRODUCTION There are many reasons applications need to scale. Some applications have on/off periods of batch processing (for example overnight render farms), some have predictable peak loads (for example share market applications peak during open and close of the market) and some might have unpredictable peak periods (for example your website gets linked by Slashdot). In the case of predictable peak loads we can easily log in to the Windows Azure portal and adjust our configuration file to increase the number of instances of our web and worker roles. However, when application load peaks unexpectedly, we want our applications to respond immediately. For applications with global reach, this might be when we least expect it. Without appropriate monitoring techniques we may not even know the extent to which we are failing to serve requests. On the flip side, we are paying for every CPU core hour we use. Thus we want to be able to scale down instances that are underutilised. We need to know how to auto-scale; our applications need to become smart. A BASIC APPROACH There are a number of jigsaw pieces that need to fit together to build the auto-scaling picture. The first piece is monitoring, which lets us pull information from the roles that need to auto-scale. The next piece is about establishing rules and measuring against thresholds to determine when to scale up and scale down. The third piece establishes trust between the service that is doing the monitoring (referred to from here on as the ‘Scale Agent’), and the roles that are being monitored. Finally, the Scale Agent needs to instruct the Windows Azure Portal to add or remove instances of those roles as it deems necessary. Monitoring Rules Scale Agent Trust Instruct THE SCALE AGENT The Scale Agent is responsible for monitoring your application, applying rules and instructing the API to scale your roles, and can be hosted in different ways. One option is hosting the agent as another process on your existing Azure roles, but a role can have many identical instances, so which instance would it run on? And the agent will take some CPU resources, could that impact on its ability to 39
• The Windows Azure Platform: Articles from the Trenches
assess the other work running on the same role? It makes more sense to move the Scale Agent to a separate location that doesn't interfere with the standard workload, where its own workload won't pollute the statistics. The agent can be hosted as another worker role, separate from the main work being done by the application. This worker role would never need to scale, and could be geo-located and co-located near the compute instances that it needs to monitor. This removes external bandwidth costs and allows for faster processing/assessment. You could also host the agent off-site completely, perhaps in your own data centre, as a Windows service. This means you have more control over the agent, but the agent will be slightly slower communicating with the instances, getting performance counter logs, and issuing scale commands. A dedicated worker role is usually the best option but also the hardest to configure for trust, as we'll see further on.
MONITORING: RETRIEVING DIAGNOSTIC INFORMATION
Before we can make decisions about scaling, we need to know some simple statistics about the services we want to scale. These statistics in turn let us make informed decisions. Diagnostic helper processes will put performance counter information into table and blob storage, so this will require an Azure Storage project. There are lots of counters to choose from, but we usually want to monitor memory usage, CPU usage, and number of requests per second, and if any one of those exceeds an upper threshold then we want to scale up. The role that needs to auto-scale will be responsible for gathering its own performance information and dumping it into a storage table. This is done via configuration classes, available in the Microsoft.WindowsAzure.Diagnostics namespace:
var perfConfig = new PerformanceCounterConfiguration();
perfConfig.CounterSpecifier = @"\Processor(0)\% Processor Time";
perfConfig.SampleRate = TimeSpan.FromSeconds(5);
We create a configuration item for a performance counter we want to track – in this example we want information about CPU utilisation. The average utilisation will be gathered over 5 second intervals.
var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.PerformanceCounters.DataSources.Add(perfConfig);
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
DiagnosticMonitor.Start("DiagnosticsConnectionString", config); 40
  • The Windows Azure Platform: Articles from the Trenches We then add the performance counter to the list of items we want the DiagnosticMonitor to track for us. The DiagnosticMonitor runs in a separate process on the virtual machine instance so it won’t interfere with our normal application code. Every minute new performance counter information will be written back to a storage account as specified in the ‘DiagnosticsConnectionString’, into a table called ‘WADPerformanceCountersTable’. We can verify the counter information made it into the table using 3rd party tools You can see that the table has an entity which has a property called ‘CounterValue’ which contains our CPU utilisation. I won’t go into the code required to view an entity in table storage; this is very well documented already11. Your Scale Agent will retrieve these values by polling the table occasionally and keeping track of the utilisation, scaling when needed. RULES: ESTABLISHING WHEN TO SCALE 11 http://blogs.msdn.com/jnak/archive/2010/01/06/walkthrough-windows-azure-table-storage-nov-2009-and-later.aspx 41
  • The Windows Azure Platform: Articles from the Trenches The Scale Agent now knows what levels your various role instances are at based on the performance counter information. However deciding when to scale up/down is difficult and can easily become an exercise in advanced mathematics. Although the rules are different for every application, here are some common issues to consider:  You usually need a certain amount of head room, in case you get a sudden spike in load before your Scale Agent can spin up more instances  Immediately after scaling up, your original instances might still be over the threshold – prevent your agent from scaling up again immediately until enough time has passed that you can be positive that more scale is needed  Aggregate your usage from all instances – if a single instance is spiking but the rest are under normal load, you don’t really need to scale  If you do need more instances, scale up based on how many instances you currently have. For example, if you only have 5 instances, you might want to add 2 more (40% increase) before checking again. If you have 50 you may only want to add 10 (20% increase)  Try to predict load based on patterns of behaviour. For instance, if over the last 15 minutes you’ve been steadily climbing by 5% utilisation per minute, you can predict that you will probably go over your threshold in X number of minutes. Why wait until you are over loaded and losing connections before scaling? Analysing these kinds of patterns can let you scale up “just in time”  Predictive patterns can get very complicated – if at 4pm every day you seem to have additional load, prepare in advance for scale rather than waiting for auto-scale to kick in  Keep in mind that long running requests can provide false positives – if all web threads are used for an instance but all those threads are held up in IO requests, you will still have low CPU utilisation, so consider a range of performance counters specific to your type of application and architecture  Hard limits – If your average is 3 instances, would you want your application to be allowed to auto-scale up to 500 instances? That’s probably not a credit card bill you want to receive, so consider imposing some hard limits to scale, or provide some reasonable alerting (SMS, email, etc) so that if your app DOES scale to 500, you can find out immediately and hop online to see why TRUST: AUTHORISING FOR SCALE There is a rich management API that can be used to control your Windows Azure projects, however in order to issue commands there needs to be trust between the Scale Agent and the API of the account hosting the roles – this trust is established via X509 certificates. Generating certificates is also well documented. Once created, we need to provide our certificate in 3 places:  The Windows Azure Account – for the Service Management API to check requests against  The virtual machine issuing commands – in our case, where the Scale Agent is hosted  The service configuration and definition for our Scale Agent project 42
  • The Windows Azure Platform: Articles from the Trenches In the Windows Azure portal for the account you wish to manage, there is an ‘Account’ tab where you can upload DER encoded certificates with a .CER extension: You must also upload the certificate in the Personal Information Exchange format with a .PFX extension and the matching password to your service project so that the certificate becomes available to any virtual machine instance provisioned from that entire project. This can be found under the Certificates section of your service deployment: Click on ‘Manage’ and upload the .PFX version of your certificate. It is important to note that this is not installing the certificate to the role instances under this service. Instead it is making the certificate available to any role that requests it. To make that request we have to complete the third step and tell our Scale Agent role that it will require that certificate. 43
  • The Windows Azure Platform: Articles from the Trenches While it is possible to enter the required XML manually, it is much easier to use the property pages instead. For the role that needs the certificate (i.e. your Scale Agent role) find it in your Cloud Service project, right click and select properties. In the property pages, find the Certificates tab on the left. Select ‘Add Certificate’ from the top and enter the details. The important part here is finding your certificate under the right Store Location and Name. This screen presumes the certificate is installed locally as it uses local machine stores to search for it. If you don’t have it installed locally, you can just paste in the thumbprint manually. That wraps up all 3 parts of the certificate process. When your role is deployed to Windows Azure, it will ask for the certificate with that thumbprint to be installed into the virtual machine. SCALING – THE SERVICE MANAGEMENT API We know we need to scale, we have established trust, all we need to do is issue the command: scale! All API calls are RESTful, but there is no API that exists solely for scaling up and down. Instead this is done through the service configuration file, which is maintained separately from the service deployment. You can at any time go and change the configuration for your deployment through the portal, and the API is just an extension of this functionality. The steps required are: 1. Request the configuration file for a service deployment 2. Find the XML element for the instance count on the role you are scaling 3. Make the change 4. Post the configuration file back to the service API If you don’t want to manually manipulate the REST API yourself, Microsoft has posted code samples to assist you, including samples on scale12 and services management API 13. 12 http://code.msdn.microsoft.com/azurescale 13 http://code.msdn.microsoft.com/windowsazuresamples 44
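Steps 2 and 3 of the list above – finding the Instances element for the role being scaled and changing its count – amount to a small XML manipulation before the configuration is posted back through the Service Management API. The following is only a sketch of that step (the role name is a placeholder and error handling is omitted); it is not taken from the Microsoft samples referenced above:

using System.Linq;
using System.Xml.Linq;

public static class ConfigurationScaler
{
    // Sets the instance count for one role in a ServiceConfiguration (.cscfg)
    // document and returns the updated XML ready to be posted back.
    public static string SetInstanceCount(string configurationXml, string roleName, int instanceCount)
    {
        XNamespace ns = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration";
        XDocument config = XDocument.Parse(configurationXml);

        XElement instances = config
            .Descendants(ns + "Role")
            .Where(r => (string)r.Attribute("name") == roleName)
            .Descendants(ns + "Instances")
            .Single();

        instances.SetAttributeValue("count", instanceCount);
        return config.ToString();
    }
}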
  • The Windows Azure Platform: Articles from the Trenches SUMMARY This short article provides you with the theory to scale up your applications reactively. Scheduled scale up/down can also be automated with the same technique defined above but instead of scaling reactively, you can also scale proactively. While this article has presented just one way of scaling automatically, there are other derivatives and approaches you could follow. For example, the Scale Agent could pull diagnostic information from the roles via the Diagnostic Manager classes, rather than the roles pushing that information. Open source framework Lokad.Cloud14 takes another approach by allowing roles to auto-scale themselves. Find the approach that’s right for you and capitalise on economies of scale today! 14 http://code.google.com/p/lokad-cloud/ 45
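Returning to the monitoring piece described earlier: the counter values that the DiagnosticMonitor transfers into the WADPerformanceCountersTable table can be read by the Scale Agent with an ordinary table query. The sketch below assumes the standard WAD column names (CounterName, CounterValue, EventTickCount and so on); the time window, the client-side filtering and everything else is illustrative only:

using System;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// Declares only the columns the Scale Agent cares about.
public class PerformanceCounterEntity : TableServiceEntity
{
    public long EventTickCount { get; set; }
    public string Role { get; set; }
    public string RoleInstance { get; set; }
    public string CounterName { get; set; }
    public double CounterValue { get; set; }
}

public class CounterReader
{
    // Average CPU utilisation recorded across all instances over the given window.
    public static double AverageCpu(CloudStorageAccount account, TimeSpan window)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();
        TableServiceContext context = tableClient.GetDataServiceContext();

        long fromTicks = DateTime.UtcNow.Subtract(window).Ticks;

        var samples = (from entity in context.CreateQuery<PerformanceCounterEntity>("WADPerformanceCountersTable")
                       where entity.EventTickCount >= fromTicks
                       select entity)
                      .AsTableServiceQuery()
                      .Execute()
                      .Where(e => e.CounterName.Contains("Processor"))   // filter on counter name client-side
                      .ToList();

        return samples.Count == 0 ? 0.0 : samples.Average(e => e.CounterValue);
    }
}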
  • The Windows Azure Platform: Articles from the Trenches BUILDING A CONTENT-BASED ROUTER SERVICE ON WINDOWS AZURE By Josh Tucholski Some applications, depending on their nature, require priority processing based on request content. It is typical in these scenarios to develop an application layer to route requests from the client to a specific business component for further processing. Implementing this in Windows Azure is not straightforward due to its built-in load balancer. The Windows Azure load balancer only exposes a single external endpoint that clients interact with; therefore it is necessary to know the unique IP address of the instance that will be performing the work. IP addresses are discoverable via the Windows Azure API when marked as internal (configured through the web role’s properties). While this tutorial may seem more of an exercise on WCF than on Windows Azure, it is important to understand how to perform inter-role communication without the use of queues. In order to filter requests by content, an internal LoadBalancer class is created. This class ensures requests are routed to live endpoints and not dead nodes. The LoadBalancer will need to account for endpoint failure and guarantee graceful recovery by refreshing its routing table and passing requests to other nodes capable of processing. Following is the class definition for the LoadBalancer to detect endpoints and recover from unexpected failures that occur. public class LoadBalancer { public LoadBalancer() { if (IsRoutingTableOutOfDate()) { RefreshRoutingTable(); } } private bool IsRoutingTableOutOfDate() { //Retrieve all of the instances of the Worker Role var roleInstances = RoleEnvironment.Roles["WorkerName"].Instances; //Check current amount of instances and confirm sync with the LoadBalancer’s //record if (roleInstances.Count() != CurrentRouters.Count()) { return true; 46
  • The Windows Azure Platform: Articles from the Trenches } foreach (RoleInstance roleInstance in roleInstances) { var endpoint = roleInstance.InstanceEndpoints["WorkerEndpoint"]; var ipAddress = endpoint.IPEndpoint; if (!IsEndpointRegistered(ipAddress)) { return true; } } return false; } private void RefreshRoutingTable() { var currentInstances = RoleEnvironment.Roles["WorkerName"].Instances; RemoveStaleEndpoints(currentInstances); AddMissingEndpoints(currentInstances); } private void AddMissingEndpoints(ReadOnlyCollection<RoleInstance> currentInstances) { foreach (var instance in currentInstances) { if (!IsEndpointRegistered(instance.InstanceEndpoints["WorkerEndpoint"].IPEndpoint )) { //add to the collection of endpoints the LoadBalancer is aware of } } } private void RemoveStaleEndpoints(ReadOnlyCollection<RoleInstance> currentInstances) { //reverse-loop so we can remove from the collection as we iterate for (int index = CurrentRouters.Count() - 1; index >= 0; index--) { bool found = false; foreach (var instance in currentInstances) { //determine if IP address already exists set found to true } if (!found) { //remove from collection of endpoints LoadBalancer is aware of } } } private bool IsEndpointRegistered(IPEndpoint ipEndpoint) { foreach (var routerEndpoint in CurrentRouters) { if (routerEndpoint.IpAddress == ipEndpoint.ToString()) { return true; } } return false; } public string GetWorkerIPAddressForContent(string contentId) { //Custom logic to determine an IP Address from one of the CurrentRouters //that the load balancer is aware of } 47
• The Windows Azure Platform: Articles from the Trenches
}
The LoadBalancer is capable of auto-detecting endpoints and the remaining work for the router service is WCF. A router, by definition, must be capable of accepting and forwarding any inbound request. The IRouterServiceContract will accept all requests with the base-level message class and handle and reply to all actions. Its interface is as follows:
[ServiceContract(Namespace = "http://www.namespace.com/ns/2/2009", Name = "RouterServiceContract")]
public partial interface IRouterServiceContract
{
    [OperationContract(Action = "*", ReplyAction = "*")]
    Message ProcessMessage(Message requestMessage);
}
The implementation of the IRouterServiceContract will use the MessageBuffer class to create a copy of the request message for further inspection (e.g. who the sender is or determining if there is a priority associated with it). GetWorkerIPAddressForContent on the LoadBalancer is invoked and a target endpoint is requested. Once the router has an endpoint, a ChannelFactory is initialized to create a connection to the endpoint and the generic ProcessMessage method is invoked. Ultimately the endpoint that the router forwards requests to will have a detailed service contract capable of completing the message processing.
public partial class RouterService : IRouterServiceContract
{
    private readonly LoadBalancer loadBalancer;
    public RouterService()
    {
        loadBalancer = new LoadBalancer();
    }
    public Message ProcessMessage(Message requestMessage)
    {
        //Create a MessageBuffer to attain a copy of the request message for inspection
        MessageBuffer messageBuffer = requestMessage.CreateBufferedCopy(int.MaxValue);
        Message requestMessageCopy = messageBuffer.CreateMessage();
        string ipAddress = loadBalancer.GetWorkerIPAddressForContent("content");
        string serviceAddress = String.Format("http://{0}/Endpoint.svc/EndpointBasic", ipAddress);
        using (var factory = new ChannelFactory<IRouterServiceContract>(new BasicHttpBinding("binding")))
        {
            IRouterServiceContract proxy = factory.CreateChannel(new EndpointAddress(serviceAddress));
            using (proxy as IDisposable)
            {
                return proxy.ProcessMessage(requestMessageCopy);
            }
        }
    }
}
Detecting and ensuring that the endpoints are active is half the battle. The other half is determining what partitioning scheme works effectively when filtering requests to the correct endpoint. You may decide to implement some way of consistently ensuring a client's requests are processed by the same back-end component, or route based on message priority. The approach outlined above also attempts to accommodate disaster-related scenarios so that an uninterrupted experience can be provided to the client. If one of the back-end components happens to shut down due to a hardware failure, the load balancer implementation will ensure that there is another endpoint available for processing. 48
• The Windows Azure Platform: Articles from the Trenches
BING MAPS TILE SERVERS USING AZURE BLOB STORAGE
By Steve Towler
Back in early 2009, I was assigned to a project where I was required to build an informational mapping solution for a customer's website. This mapping solution served custom tiles of the UK which were specially commissioned for the project. Although the map only covered the UK and we had restricted the zoom levels to between 6 and 11, each set of tiles (and there were twelve sets) had around 4500 tiles and averaged 80 megabytes in size. Less than 1 gigabyte of tiles may seem like a trivial figure in terms of the vast amounts of storage we have at our disposal nowadays. But what if things had been different? What if the customer wanted to cover Europe or even more zoom levels? What would be the bandwidth implications and the potential costs associated with huge demand for the map? With Windows Azure now "live", had the same project landed on my desk today I would be looking to serve the map tiles differently, as Blob storage is ideally suited to such a task. Storage is infinitely scalable, cheap and its RESTful interface makes requesting the tiles clean and simple. Setting up a Bing Maps tile server using Windows Azure Blob storage is surprisingly easy and you can have your own tile server up and running in a few small steps. First things first, you need to crunch your tiles. This is the process whereby you take your custom map images and cut them up into tiles, ready to be used within your mapping application. There are plenty of tutorials on how to do this out on the web and Microsoft MapCruncher is a preferred tool for carrying this task out. Now that you have your "crunched tiles" and you have saved them off to a directory on your local machine, the next step is to get your tiles up into the cloud. For ease I am going to use CloudXplorer, one of the many Windows Azure storage management tools available on the web. Using CloudXplorer, create a public container in blob storage called tiles. Now copy all of your "crunched tiles" from your local machine up to your newly created container in blob storage. 49
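If you would rather script this step than use a GUI tool, the container setup and upload can also be done with a few lines of the StorageClient library. This sketch is my own and was not part of the original project; the local folder path and the account details are placeholders:

using System.IO;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class TileUploader
{
    static void Main()
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey");

        CloudBlobClient blobClient = account.CreateCloudBlobClient();

        // Create the public "tiles" container if it does not already exist.
        CloudBlobContainer container = blobClient.GetContainerReference("tiles");
        container.CreateIfNotExist();
        container.SetPermissions(new BlobContainerPermissions
        {
            PublicAccess = BlobContainerPublicAccessType.Blob
        });

        // Upload every crunched tile from the local folder.
        foreach (string file in Directory.GetFiles(@"C:\CrunchedTiles", "*.png"))
        {
            CloudBlob blob = container.GetBlobReference(Path.GetFileName(file));
            blob.Properties.ContentType = "image/png";
            blob.UploadFile(file);
        }
    }
}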
• The Windows Azure Platform: Articles from the Trenches
Once complete, your tiles should be publicly available using a URL like: http://myaccount.blob.core.windows.net/tiles/0313131311133.png (or http://127.0.0.1:10000/devstoreaccount1/tiles/0313131311133.png if you are using local development storage). You will now be able to consume the tiles from your tile server using Bing Maps. MSDN includes a piece of code (select the JScript tab) which shows you how to add your own custom tile layer to a Bing map. You can tweak that code to suit your own requirements, but the important thing to remember is to change the VETileSourceSpecification path to point to your new tile server:
var tileSourceSpec = new VETileSourceSpecification("lidar", "http://myaccount.blob.core.windows.net/tiles/%4.png");
The project I mentioned at the very beginning of this article was a success, and a happy customer is actively informing potential customers of its presence in the UK. Had Windows Azure been out of CTP, how differently would the project have turned out? The software consuming the tiles would have been the same, but the infrastructure serving the tiles would most certainly have been in the cloud. 50
  • The Windows Azure Platform: Articles from the Trenches AZURE DRIVE By Neil Mackenzie Azure Drive is a feature of Windows Azure providing access to data contained in an NTFS-formatted virtual hard disk (VHD) persisted as a page blob in Azure Storage. A single Azure instance can mount a page blob for read/write access as an Azure Drive. However, multiple Azure instances can mount a snapshot of a page blob for read-only access as an Azure Drive. The Azure Storage blob lease facility is used to prevent more than one instance at a time mounting the page blob as an Azure Drive. It is not possible to mount an Azure Drive in an application not resident in the Azure cloud or development fabric. An appropriately created and formatted VHD can be uploaded into a page blob from where it can be mounted as an Azure Drive by an instance of an Azure Service. Similarly, the page blob can be downloaded and attached as a VHD in a local system. The Azure SDK provides three classes in the Microsoft.WindowsAzure.StorageClient namespace to support Azure Drives:  CloudDrive  CloudDriveException  CloudStorageAccountCloudDriveExtensions CloudDrive is a small class providing the core Azure Drive functionality. CloudDriveException allows Azure Drive errors to be caught. CloudStorageAccountCloudDriveExtensions, similar to the CloudStorageAccountStorageClientExtensions class, provides an extension method to CloudStorageAccount allowing a CloudDrive object to be constructed. GUEST OS Azure Drive requires that the osVersion attribute in the Service Configuration file be set to WA- GUEST-OS-1.1_201001-01 or a later version. For example: <ServiceConfiguration serviceName="CloudDriveExample" xmlns= "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration" osVersion="WA-GUEST-OS-1.1_201001-01"> VHD The VHD for an Azure Drive must be a fixed hard disk image formatted as a single NTFS volume. It must be between 16MB and 1TB in size. A VHD is a single file comprising a data portion followed by a 512 byte footer. For example, a nominally 16MB VHD occupies 0x1000200 bytes comprising 0x1000000 bytes of data and 0x200 footer bytes. When uploading a VHD it is important to remember to upload the footer. Furthermore, since pages of a page blob are initialized to 0 it is not necessary to upload pages in which all the bytes are 0. This could save a significant amount of time when uploading a large VHD. The Disk Management component of the Windows Server Manager can be used to create and format a VHD. 51
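The upload advice above – remember the 512 byte footer, and skip pages that contain only zeros – can be followed with the page blob support in the StorageClient library. The following is only a rough sketch under those assumptions: it reads the VHD in 512-byte-aligned chunks and writes only the chunks that contain non-zero data, leaving the rest as uninitialized pages:

using System.IO;
using System.Linq;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class VhdUploader
{
    public static void Upload(CloudStorageAccount account, string vhdPath,
        string containerName, string blobName)
    {
        CloudBlobClient blobClient = account.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference(containerName);
        container.CreateIfNotExist();

        CloudPageBlob pageBlob = container.GetPageBlobReference(blobName);

        using (FileStream vhd = File.OpenRead(vhdPath))
        {
            // The page blob is created at the full VHD size (data plus the 512 byte footer).
            pageBlob.Create(vhd.Length);

            const int chunkSize = 512 * 1024; // a multiple of the 512 byte page size
            byte[] buffer = new byte[chunkSize];
            long offset = 0;
            int read;

            while ((read = vhd.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Pages of a page blob are initialized to 0, so all-zero chunks can be skipped.
                if (buffer.Take(read).Any(b => b != 0))
                {
                    using (var chunk = new MemoryStream(buffer, 0, read))
                    {
                        pageBlob.WritePages(chunk, offset);
                    }
                }
                offset += read;
            }
        }
    }
}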
• The Windows Azure Platform: Articles from the Trenches
CLOUDDRIVE
A CloudDrive object can be created using either a constructor or the CreateCloudDrive extension method to CloudStorageAccount. For example, the following creates a CloudDrive object for the VHD contained in the page blob resource identified by the URI in cloudDriveUri:
CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse( RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
CloudDrive cloudDrive = cloudStorageAccount.CreateCloudDrive(cloudDriveUri.AbsoluteUri);
Note that this creates an in-memory representation of the Azure Drive which still needs to be mounted before it can be used. Create() physically creates a VHD of the specified size and stores it as a page blob. Note that Microsoft charges only for initialized pages of a page blob so there should only be a minimal charge for an empty VHD page blob even when the VHD is nominally of a large size. The Delete() method can be used to delete the VHD page blob from Azure Storage. Snapshot() makes a snapshot of the VHD page blob containing the VHD while CopyTo() makes a physical copy of it at the specified URL. A VHD page blob must be mounted on an Azure instance to make its contents accessible. A VHD page blob can be mounted on only one instance at a time. However, a VHD snapshot can be mounted as a read-only drive on an unlimited number of instances simultaneously. A snapshot therefore provides a convenient way to share large amounts of information among several instances. For example, one instance could have write access to a VHD page blob while other instances have read-only access to snapshots of it – including snapshots made periodically to ensure the instances have up-to-date data. Before a VHD page blob can be mounted it is necessary to allocate some read cache space in the local storage of the instance. This is required even if caching is not going to be used. InitializeCache() must be invoked to initialize the cache with a specific size and location. The following shows the Azure Drive cache being initialized to the maximum size of the local storage named CloudDrives:
public static void InitializeCache()
{
    LocalResource localCache = RoleEnvironment.GetLocalResource("CloudDrives");
    Char[] backSlash = { '\\' };
    String localCachePath = localCache.RootPath.TrimEnd(backSlash);
    CloudDrive.InitializeCache(localCachePath, localCache.MaximumSizeInMegabytes);
}
The tweak in which trailing backslashes are removed from the path to the cache is a workaround for a bug in the Storage Client library. An instance mounts a writeable Azure Drive by invoking Mount() on a VHD page blob. The Azure Storage Service uses the page blob leasing functionality to guarantee exclusive access to the VHD page blob. An instance mounts a read-only Azure Drive by invoking Mount() on a VHD snapshot. Since it is read-only, multiple instances can mount the VHD snapshot simultaneously. An instance 52
• The Windows Azure Platform: Articles from the Trenches
invokes the Unmount() method to release the Azure Drive and, for VHD page blobs, allow other instances to mount the blob for write access. The cacheSize parameter to Mount() specifies how much of the cache is dedicated to this Azure Drive. The cacheSize should be set to 0 if caching is not desired for the drive. Different Azure Drives mounted on the same instance can specify different cache sizes and care must be taken that the total cache size allocated for the drives does not exceed the amount of cache available in local storage. The options parameter takes a DriveMountOptions flag enumeration that can be used to force the mounting of a drive – for example, when an instance has crashed while holding the lease to a VHD page blob – or to fix the file system. Mount() returns the drive letter, or LocalPath, to the Azure Drive - for example, "d:" - which can be used to access any path on the drive. The following example shows an Azure Drive being mounted from a VHD page blob specified by cloudDriveUri, before being used and then unmounted:
public void WriteToDrive( Uri cloudDriveUri )
{
    CloudStorageAccount cloudStorageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    CloudDrive cloudDrive = cloudStorageAccount.CreateCloudDrive(cloudDriveUri.AbsoluteUri);
    String driveLetter = cloudDrive.Mount(CacheSizeInMegabytes, DriveMountOptions.None);
    String path = String.Format("{0}\\Pippo.txt", driveLetter);
    FileStream fileStream = new FileStream(path, FileMode.OpenOrCreate);
    StreamWriter streamWriter = new StreamWriter(fileStream);
    streamWriter.Write("that you have but slumbered here");
    streamWriter.Close();
    cloudDrive.Unmount();
}
GetMountedDrives() provides access to a list of drive letters for all Azure Drives mounted in the instance. The DriveInfo class can be used to retrieve information about a mounted Azure Drive. The entry point to this information is the static method DriveInfo.GetDrives() which returns an array of DriveInfo objects representing all mounted drives on the instance.
DEVELOPMENT ENVIRONMENT
The Development Environment simulates Azure Drives in a manner that differs from their implementation in the cloud. Furthermore, the Development Storage simulation is unaware of the Azure Drive simulation, with the consequence that the standard blob manipulation methods in the Storage Client API do not work with the VHD page blobs and VHD snapshots used by the Azure Drive simulation. Instead, the blob management methods in the CloudDrive class must be used. 53
• The Windows Azure Platform: Articles from the Trenches
The Azure Drive simulation does not mount Azure Drives from VHD page blobs but through the use of subst against a folder (e.g. drivename) in a subfolder (e.g. drivecontainername) of a well-known directory:
%LOCALAPPDATA%\dftmp\wadd\devstoreaccount1
The full path to the folder that is the subst representation of the Azure Drive is:
%LOCALAPPDATA%\dftmp\wadd\devstoreaccount1\drivecontainername\drivename
Invoking CloudDrive.Create() or CloudDrive.Snapshot() causes a folder with the name of the VHD page blob or VHD snapshot to be created in this directory. CloudDrive.Delete() can be used to delete the VHD page blob or VHD snapshot. Note that, although visible in the Azure fabric, VHD page blobs and VHD snapshots do not appear in blob listings in Development Storage because they are not stored as blobs. Consequently, a VHD file uploaded to Development Storage cannot be mounted as an Azure Drive. The workaround is to attach the VHD to an empty folder in the well-known directory, from where it can be mounted exactly as it would be in the cloud. The Disk Management component of the Windows Server Manager is used to attach the VHD to an empty folder (e.g. drivename) in a subdirectory (e.g. drivecontainername) of that well-known directory. The Azure Drive API can then be used to mount the Azure Drive as if it were backed by a VHD page blob named drivename, located in a container named drivecontainername. There is no need to invoke the Create() method. There is no entry in Development Storage for this blob. Note that subst can be invoked in a command window to view the list of currently mounted Azure Drives. Azure Drives are mounted and unmounted in the Development Environment just as they are in the cloud. However, DriveMountOptions.Force is not implemented in the Development Environment. It is important to remember that Azure Drives are available only inside the Azure Fabric – cloud or development – and that they are not mountable in an ordinary Windows application. 54
  • The Windows Azure Platform: Articles from the Trenches AZURE TABLE SERVICE AS A NOSQL DATABASE By Mark Rendle The Windows Azure SDK is one of the things which sets the Azure platform above other “cloud” platforms. The Table Service SDK, in particular, wraps the massively scalable storage service in an API which is instantly familiar to anyone who has used LINQ-to-SQL or the Entity Framework. CLR property names are used as column names, class names are (by default) used as table names. But this simplicity enforces the concept of schema over a data store which is innately schema-less. When you create an Azure Table, you do not specify columns. The table itself is not structured in that way. The column names are part of the entities (rows) which are stored in the table, and they can be different for different entities within a single table. This fact opens up a world of interesting possibilities when it comes to planning and designing your persistence layer. MASTER-DETAIL STRUCTURES The Table Service does not support relational features, such as primary/foreign keys, joins in queries, or transactions to coordinate modifications across multiple tables. But Table Service entities with the same partition key are held together in the store, can be retrieved very quickly with a single query, and can be modified together inside Entity Group Transactions 15. Because rows can have different structures, you can actually store the data from two (or more) different types of object within the same partition key in the same table. Let’s say, for example, you are storing Invoices, with an arbitrary number of line items for each invoice. Using the invoice number for the Partition Key, an empty string value for the Row Key of the Invoice entity, and then sequential numbering for the Row Keys of the LineItem entity, this “master-detail” data can be created in a single transaction, and retrieved incredibly quickly in a single query. Figure 1: Invoice and Line Item entities stored within a single Azure Table DYNAMIC SCHEMA It’s very common these days for database applications to allow the end user to extend the out-of- the-box data model with their own fields. In a relational database, this is commonly achieved with a complicated system of metadata tables, and performance when querying against these custom fields is accordingly horrible. 15 http://msdn.microsoft.com/en-us/library/dd894038.aspx 55
  • The Windows Azure Platform: Articles from the Trenches In Azure, these fields can be added to each entity just by specifying the extra column names and values in the Insert operation. And then subsequently, querying against these columns can be done in exactly the same way as against the columns that were part of the original application. Be aware, though, that this approach requires you to get down and dirty with the REST API, where you have complete control over column names at the per-entity level. Also be aware that there is a hard limit of 255 properties per entity, including the Partition Key, Row Key and Timestamp system columns. COLUMN NAMES AS DATA There are times when you want to store several thousand rows of related data; things like activity logs, or relationships between users in a social-networking database. Azure Tables can handle this volume of data very easily, but because a query operation can only return 1,000 rows per result set, retrieving them all could take several round-trips to the server, increasing the time of the operation and the cost of the transactions. If the data can be stored as a single string or binary blob, though, you can group 250 “rows” together in a single entity, using the column name as a makeshift sub-key. This is possible because there is absolutely no limit to the number of different column names that can be used within a table. The best way to achieve this is to use an empty Row Key to identify the “active” entity; that is, the one with spare columns to add data into. In addition, have an extra column, named “UniqueId”, with a timestamp or Guid value. By running MERGE updates against this entity, you can add new “rows”; when the MERGE operation fails with a “too many values” error, you simply create a copy of that entity (Row Keys cannot be updated) with the UniqueId value as the Row Key, and reset the active entity to clear all the values and set a new UniqueId: this prevents two simultaneous operations from creating duplicate copies of the entity. Figure 2: Using column names as data to reduce number of rows used for high-volume tables TABLE NAMES AS DATA Another thing which is not limited is the number of tables you can create within an account or project. And because you don’t have to specify the schema of each table, creating hundreds of them is not prohibitive. One obvious use for this is where you need short-lived, high-volume tables, perhaps to contain analytics data which gets archived and cleared down after a couple of weeks (to cut down on storage costs). Running hundreds of DELETE operations against a single table comes with scalability issues and incurs a high transaction cost. But if you create multiple tables with the date as part of the name, clearing down a day’s data is just a matter of dropping the table; one quick operation. 56
  • The Windows Azure Platform: Articles from the Trenches Figure 3: Using table names as part of data schema SUMMARY As you can see, the scope for creative schema design in Azure Table storage is massive. This is one of the best things about the NoSQL family of databases: many of the problems with which we have traditionally struggled when using rigidly-structured relational databases have much simpler, more direct solutions in a less-structured paradigm. Whilst the official Microsoft Azure SDK is a great tool for modelling a lot of domains, and provides a very usable interface to the powerful features of the Azure storage stack with its familiar LINQ DataContexts and Query providers, I hope this short article has highlighted a few of the things you can achieve by digging deeper into the SDK, or ignoring it entirely and learning to use the REST API to fully exploit the NoSQL nature of Azure Table Storage. 57
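As a small illustration of the table-per-period idea from the 'Table names as data' section, creating a day's analytics table and later dropping it is a single call each with the StorageClient library; the 'Analytics' prefix below is my own example, not a convention from the article:

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class DailyTables
{
    // e.g. "Analytics20100622" for 22 June 2010
    public static string TableNameFor(DateTime day)
    {
        return "Analytics" + day.ToString("yyyyMMdd");
    }

    public static void CreateTodaysTable(CloudStorageAccount account)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();
        tableClient.CreateTableIfNotExist(TableNameFor(DateTime.UtcNow));
    }

    // Clearing down an old day's data is one quick operation: drop the whole table.
    public static void DropTableFor(CloudStorageAccount account, DateTime day)
    {
        CloudTableClient tableClient = account.CreateCloudTableClient();
        tableClient.DeleteTableIfExist(TableNameFor(day));
    }
}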
  • The Windows Azure Platform: Articles from the Trenches QUERIES AND AZURE TABLES By Neil Mackenzie CREATEQUERY<T>() There are several classes involved in querying Azure Tables using the Azure Storage Client library. However, a single method is central to the querying process and that is CreateQuery<T>() in the DataServiceContext class. CreateQuery<T>() is declared: public DataServiceQuery<T> CreateQuery<T>(String entitySetName); This method is used implicitly or explicitly in every query against Azure Tables using the Storage Client library. The CreateQuery<T>() return type is DataServiceQuery which implements both the IQueryable<T> and IEnumerable<T> interfaces: LINQ supports the decoration of a query by operators filtering the results of the query. Although a full LINQ implementation has many decoration operators only the following are implemented for the Storage Client library:  Where  Take  First  FirstOrDefault These are implemented as extension methods on the DataServiceQuery<T> class. When a query is executed these decoration operators are translated into the $filter and $top operators used in the Azure Table Service REST API query string submitted to the Azure Table Service. The following example demonstrates a trivial use of CreateQuery<T>() and the Take() operator to retrieve ten records from a Songs table: protected void SimpleQuery(CloudTableClient cloudTableClient) { TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext(); tableServiceContext.ResolveType = (unused) => typeof(Song); IQueryable<Song> songs = (from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity).Take(10); List<Song> songsList = songs.ToList<Song>(); } As with other LINQ implementations the query is not submitted to the Azure Table Service until the query results are enumerated. Note the use of ResolveType to work around a performance issue when the table name differs from the class name. 58
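The Song model class used by the examples in this article is not listed. A minimal version consistent with the queries shown might look like the following – the Singer property appears in later examples, while anything else is purely illustrative:

using Microsoft.WindowsAzure.StorageClient;

public class Song : TableServiceEntity
{
    public Song() { }

    public Song(string partitionKey, string rowKey)
        : base(partitionKey, rowKey) { }

    // Used by the later examples in this article.
    public string Singer { get; set; }

    // Illustrative extra property.
    public string Title { get; set; }
}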
  • The Windows Azure Platform: Articles from the Trenches The MSDN Azure documentation has a page showing several examples of LINQ queries demonstrating filtering on properties with the various datatypes – String, numbers, Boolean and DateTime – so they will not be repeated here. Instead, this article focuses on the various methods provided to invoke queries. CONTEXTS The SimpleQuery example used the TableServiceContext.CreateQuery() method as follows: tableServiceContext.CreateQuery<Song>("Songs") This syntax can be simplified by deriving a class from TableServiceContext as follows: public class SongContext : TableServiceContext { internal static String TableName = "Songs"; public SongContext(String baseAddress, StorageCredentials credentials) : base(baseAddress, credentials) { } public IQueryable<Song> Songs { get { return this.CreateQuery<Song>(TableName); } } public void AddSong(Song song) { this.AddObject(TableName, song); this.SaveChanges(); } } This class is specific to the Song model class representing the entities in the Azure table named Songs. The Songs property can be used as the core of any LINQ query instead of the tableServiceContext.CreateQuery<Song>(“Songs”) used previously. Doing this simplifies and improves the readability of the LINQ query. For example, the LINQ query: from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity can be rewritten as: from entity in songContext.Songs select entity where songContext is a SongContext object. QUERYING ON PARTITIONKEY AND ROWKEY The primary key for an entity in an Azure table comprises PartitionKey and RowKey. The most performant query in the Azure Table Service is one specifying both PartitionKey and RowKey returning a single entity. When handling a query specifying PartitionKey and not RowKey the Azure 59
  • The Windows Azure Platform: Articles from the Trenches Table Service scans every entity in the partition while for a query specifying RowKey and not PartitionKey it must query each partition separately. CONTINUATION A query specifying both PartitionKey and RowKey is the only query guaranteed to return its entire result set in a single response. A further limit on query results is that no more than 1,000 results are ever returned in response to a single request – regardless of how many entities satisfy the query filter. The Azure Table Service inserts a continuation token in the response header to indicate there are additional results which can be retrieved through an additional request parameterized by the continuation token. DATASERVICEQUERY DataServiceQuery is the WCF Data Services class representing a query to the Azure Table Service. DataServiceQuery provides the following methods to send queries to the Azure Table Service. public IAsyncResult BeginExecute(AsyncCallback callback, Object state); public IEnumerable<TElement> EndExecute(IAsyncResult asyncResult); public IEnumerable<TElement> Execute(); Execute() is a synchronous method which sends the query to the Azure Table Service and blocks until the query returns. BeginExecute() and EndExecute() are a matched pair of methods used to implement the AsyncCallback Delegate model for asynchronously accessing the Azure Table Service. The following is an example of Execute(): protected void UsingDataServiceQueryExecute(CloudTableClient cloudTableClient) { TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext(); tableServiceContext.ResolveType = (unused) => typeof(Song); DataServiceQuery<Song> dataServiceQuery = (from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity).Take(10) as DataServiceQuery<Song>; IEnumerable<Song> songs = dataServiceQuery.Execute(); foreach (Song song in songs) { String singer= song.Singer; } } Note that the query must be explicitly cast from an IQueryable<Song> to a DataServiceQuery<Song>. The asynchronous model is implemented by invoking BeginExecute() passing it the name of a static callback delegate and, optionally, an object providing some invocation context to the callback delegate. In practice, this object must include the DataServiceQuery object on which BeginExecute() was invoked. BeginExecute() initiates query submission and sets up an IO Completion Port to wait for the query to complete. When it completes, the callback delegate is invoked on a worker thread. EndExecute() must be invoked in the callback delegate to access the results. Furthermore, a failure to invoke EndExecute() could lead to resource leakage. EndExecute() returns an object of type 60
  • The Windows Azure Platform: Articles from the Trenches QueryOperationResponse<T> which implements an IEnumerable<T> interface. QueryOperationResponse<T> exposes information about the query request and response including the HTTP status of the response. Note that the version of WCF Data Services currently used in Azure does not support server-side paging so that a DataServiceQuery is not able to process continuation tokens. The next version does but is not yet released in the Azure environment. Consequently, DataServiceQuery.Execute() may not retrieve all the entities requested if there are more than 1,000 of them – or, indeed, if there is a need for continuation tokens which can happen on any query not specifying both PartitionKey and RowKey. CLOUDTABLEQUERY The CloudTableQuery<T> class supports continuation tokens. A CloudTableQuery<T> object is created using one of the two constructors: public CloudTableQuery<TElement>(DataServiceQuery<TElement> query, RetryPolicy policy); public CloudTableQuery<TElement>(DataServiceQuery<TElement> query); or the AsTableServiceQuery() extension method of the TableServiceExtensionMethods class: public static CloudTableQuery<TElement> AsTableServiceQuery<TElement> ( IQueryable<TElement> query ) The CloudTableQuery<T> class has the following synchronous methods to handle query submission to the Azure Table Service: public IEnumerable<TElement> Execute(ResultContinuation continuationToken); public IEnumerable<TElement> Execute(); Execute() handles continuation automatically and continues to submit queries to the Azure Table Service until all the results have been returned. Execute(ResultContinuation) starts the request with a previously acquired ResultContinuation object encapsulating a continuation token and continues the query until all results have been retrieved. Note that care should be taken when using either form of Execute() since large amounts of data might be returned when the query is enumerated. 61
  • The Windows Azure Platform: Articles from the Trenches The following example shows Execute() retrieving all the records from a table: protected void UsingCloudTableQueryExecute(CloudTableClient cloudTableClient) { TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext(); tableServiceContext.ResolveType = (unused) => typeof(Song); CloudTableQuery<Song> cloudTableQuery = (from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity).AsTableServiceQuery<Song>(); IEnumerable<Song> songs = cloudTableQuery.Execute(); foreach (Song song in songs) { String singer = song.Singer; } } The CloudTableQuery<T> class has an equivalent set of asynchronous methods declared: public IAsyncResult BeginExecuteSegmented(ResultContinuation continuationToken, AsyncCallback callback, Object state); public IAsyncResult BeginExecuteSegmented(AsyncCallback callback, Object state); public ResultSegment<TElement> EndExecuteSegmented(IAsyncResult asyncResult); These follow the method-naming style used elsewhere in the Storage Client library whereby the suffix Segmented indicates that the methods bring data back in batches – in this case from one continuation token to the next. This provides a convenient method of paging through results in batches of size specified by the Take() query decoration operator or the 1,000 records that is the maximum number of records retrievable in a single request. As with the synchronous Execute() methods the difference between the two BeginExecuteSegmented() methods is that one starts the retrieval at the beginning of the query result set while the other starts at the entity indicated by the continuation token in the ResultContinuation parameter. The following is an example of BeginExecuteSegmented() and EndExecuteSegmented() paging through the result set of a query in pages of 10 entities at a time: protected void QuerySongsExecuteSegmentedAsync( CloudTableClient cloudTableClient) { TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext(); tableServiceContext.ResolveType = (unused) => typeof(Song); CloudTableQuery<Song> cloudTableQuery = (from entity in tableServiceContext.CreateQuery<Song>("Songs").Take(10) select entity ).AsTableServiceQuery<Song>(); IAsyncResult iAsyncResult = cloudTableQuery.BeginExecuteSegmented( BeginExecuteSegmentedIsDone, cloudTableQuery); } 62
  • The Windows Azure Platform: Articles from the Trenches static void BeginExecuteSegmentedIsDone(IAsyncResult result) { CloudTableQuery<Song> cloudTableQuery = result.AsyncState as CloudTableQuery<Song>; ResultSegment<Song> resultSegment = cloudTableQuery.EndExecuteSegmented(result); List<Song> listSongs = resultSegment.Results.ToList<Song>(); if (resultSegment.HasMoreResults) { IAsyncResult iAsyncResult = cloudTableQuery.BeginExecuteSegmented( resultSegment.ContinuationToken, BeginExecuteSegmentedIsDone, cloudTableQuery); } } It is also possible to iterate through subsequent results using the GetNext() method of the ResultSegment<T> class rather than using BeginExecuteSegmented() with a ResultContinuation parameter. It is worth noting the difference made by replacing the cloudTableQuery in the above example with: CloudTableQuery<Song> cloudTableQuery = (from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity).Take(10).AsTableServiceQuery<Song>(); Here, the Take(10) is outside the LINQ query definition. This query results in the retrieval of only 10 records and does not page through the table in pages of 10 entities as in the previous example. Note that exception handling is even more important in callback delegates than it is in normal code because they are not invoked from user code and errors cannot be caught outside the method. Consequently, all errors must be caught and handled inside the callback delegate. 63
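To make that last point concrete, here is one way the callback from the earlier example could be wrapped. This is a hedged sketch rather than a recommended pattern, and the Trace call simply stands in for whatever error handling the application already uses.

static void BeginExecuteSegmentedIsDone(IAsyncResult result)
{
    try
    {
        CloudTableQuery<Song> cloudTableQuery = result.AsyncState as CloudTableQuery<Song>;
        ResultSegment<Song> resultSegment = cloudTableQuery.EndExecuteSegmented(result);
        List<Song> listSongs = resultSegment.Results.ToList<Song>();
        // ... process listSongs ...
        if (resultSegment.HasMoreResults)
        {
            cloudTableQuery.BeginExecuteSegmented(resultSegment.ContinuationToken,
                BeginExecuteSegmentedIsDone, cloudTableQuery);
        }
    }
    catch (Exception e)
    {
        // nothing outside this method can observe a failure here, so record it before returning
        System.Diagnostics.Trace.TraceError("Segmented query callback failed: " + e.Message);
    }
}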
TRICKS FOR STORING TIME AND DATE FIELDS IN TABLE STORAGE
By Saksham Gautam

Windows Azure Table Storage supports storing enormous amounts of data in massively scalable tables in the cloud. The tables can store terabytes upon terabytes of data and billions of entities. In order to attain this level of scalability, Windows Azure Table storage employs a scale-out model to distribute entities across multiple storage nodes. Each application has to decide on the partition scheme by choosing the partition keys for its entities. Moreover, each entity within a partition is uniquely identified by its row key. In this section, we discuss how to use the two types of entity key to simulate a descending order based on timestamps, so that queries based on dates are more efficient.

Entity keys, PartitionKey and RowKey, are strings of up to 1KB in size. As they are strings, all comparisons are purely lexicographic, i.e. "100" < "20" < "9". At first glance, making a key from time might seem very straightforward. "Just use the 'yyyyMMddHHmmssfffff' pattern for the DateTime", you might say. Using a fixed length for the different components of the time would indeed ensure that lexical comparisons are equivalent to DateTime comparisons. However, the entities would be arranged in ascending order within the table. As many real-life applications are interested in fetching the most recent entities first, queries end up being less efficient with this simple method.

Let's examine this more closely using an example. Let us assume that we are building a location-based service application that lets mobile users send periodic position reports to a Windows Azure Worker Role, which in turn logs the reports to table storage. A 'PositionReport' entity could look something like that shown in Table 1.

Property       DataType
PartitionKey   String
RowKey         String
DeviceId       String
ReportedOn     DateTime
Latitude       Double
Longitude      Double

Table 1 Properties of a PositionReport entity

Also, suppose that the majority of queries would be something like "Get the 100 most recent position reports of device X" so that they could be displayed on a map. If we used the ascending order model, our query would first have to fetch all the entities from the table (or partition), then take the last 100. There is a way to fetch just the 100 entities that you need. The clue here lies in the ReportedOn property. Let's convert the time the device reported into a 'reverse timestamp' by simply doing the following:

(DateTime.MaxValue.Ticks - reportedOn.Ticks).ToString().PadLeft(19)
  • The Windows Azure Platform: Articles from the Trenches By reversing the number of ‘ticks’ in the time and then making it of fixed length, we create a mechanism for assigning newer entities with keys that are lexically less than those of older entities. We could use this as the RowKey for our entity. Then, we would not need to have an additional property for storing the time the device sent the position report because we could easily compute it using the RowKey as shown below. As a result, we save some bandwidth as well. new DateTime(DateTime.MaxValue.Ticks - Int64.Parse(RowKey)) The prime candidate for the partition-key would be the ID of the device, so that all entities for a single device go into one partition. However, if a device sends many position reports over time, our partition might grow enormous. Choosing a partition key is an opportunity to load balance the entities across different servers. Hence, we can definitely do better than choosing a fixed partition key. We could use a similar technique to the one we used for our RowKey. Without the loss of generality let us assume that we could keep all position reports for a device within a month in one partition. With that, constructing the reverse timestamp for our partition key is easy. We could do the following. DateTime temp = new DateTime(reportedOn.Year, reportedOn.Month, 1); We then create an identifier by concatenating the ID of the device with the reversed timestamp based on the time the device reported. But first we have to decide whether the device ID is significant in queries that we want to perform, or whether the ReportedOn property is more significant. In other words, are most of your queries something like “Give me position reports for device X” or are they more like “Give me devices that have reported within a certain interval”. Based on that, we determine whether our partition key would have timestamp or the device ID as the prefix of the partition key. Let us assume that we decided to use the device ID as prefix. Once we have our partition key, we could easily recalculate the device ID. Note that creating entity key by concatenation in this way only works if the device id is of fixed length. String deviceId = PartitionKey.Substring(0, PartitionKey.Length - 19); The PositionReport entity now looks like the one shown in Table 2. 65
Property          DataType
PartitionKey      String
RowKey            String
Latitude          Double
Longitude         Double
GetReportedOn()   Returns DateTime
GetDeviceId()     Returns String

Table 2 Modified PositionReport entity

As for the queries based on time, we can construct them such that we include at least one of the entity keys and preferably (always) the partition key, as illustrated in the following examples.

1. 100 most recent entities for the device within this month (see the sketch in code after this list)
   a. Compute the combined partition key from the device id and the reversed timestamp for the first day of the current month.
   b. Query the table using the greater-than (>) operator on the row key and the equal-to (=) operator on the value computed in 1.a.
2. 100 most recent entities for the device
   a. Note that all partition keys for entities belonging to a particular device are created by appending a suffix to the device ID. Hence, query the table storage using the greater-than (>) operator on the partition key.
   b. Since a device may not have 100 position reports, the entities returned by the query may contain entities corresponding to other devices. You should remove them in the data access layer before you use the result in the application.
   c. Note that we did not use the (>) and (<) operators to filter out results at the table storage itself, but instead chose to filter the results in our code. This is because, as of the time of this writing, if a range query is based on partition keys, i.e. it contains an 'AND' or 'OR' keyword, it results in a full table scan.
3. 100 most recent entities for the device within a specific period
   a. Construct the combined partition keys for the dates that define the interval as in 1.a.
   b. Construct the row keys by using reverse timestamps for the dates.
   c. As mentioned in 2.c, it is not efficient to use all the keys in a single query. Hence, create two queries, each using one partition key and one row key.
   d. Perform the two queries and combine (union) the entities before using them in the application.
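To make query 1 concrete, here is a minimal sketch. It assumes a PositionReport class derived from TableServiceEntity with the properties from Table 2, a table named "PositionReports" and a TableServiceContext created as shown earlier in this chapter; all of those names are illustrative rather than prescribed by the article.

// Requires Microsoft.WindowsAzure.StorageClient and System.Linq.
// deviceId, the table name and the context are assumptions made for the example.
string deviceId = "DEVICE001";
DateTime firstOfMonth = new DateTime(DateTime.UtcNow.Year, DateTime.UtcNow.Month, 1);
string partitionKey = deviceId +
    (DateTime.MaxValue.Ticks - firstOfMonth.Ticks).ToString().PadLeft(19);
// reverse timestamp of 'now' - every report already written has a lexically greater row key
string rowKeyFloor = (DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks).ToString().PadLeft(19);

CloudTableQuery<PositionReport> query =
    (from report in tableServiceContext.CreateQuery<PositionReport>("PositionReports")
     where report.PartitionKey == partitionKey
           && report.RowKey.CompareTo(rowKeyFloor) > 0
     select report).Take(100).AsTableServiceQuery<PositionReport>();

// the table service returns entities in ascending key order, so the first 100 rows
// in this partition are the 100 most recent reports
List<PositionReport> mostRecent = query.Execute().ToList();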
Using reverse ticks for entity keys should be sufficient in most cases. However, there might be scenarios where a lot of data is generated and, when you compute the entity keys using the method described above, more than one entity might try to use the same keys. Take the example of an application in which multiple processes add 'Event' entities to an 'Events' table. An 'Event' entity could look something like that shown in Table 3.

Property           DataType
PartitionKey       String
RowKey             String
EventType          Int
Description        String
GetEventSource()   Returns String
GetEventTime()     Returns DateTime

Table 3 Structure of an Event entity

The partition key is based on the event source, which could be the name of the process that generated the event, and the row key is based on the event time. If there are multiple processes with the same name that write into the table at the exact same time, we would run into problems, because both partition keys and row keys have to be unique. The solution to avoid such entity key collisions is to append a globally unique identifier (GUID) to the end of the row key. Hence, the row keys would be computed like so:

String revTicks = (DateTime.MaxValue.Ticks - eventTime.Ticks).ToString().PadLeft(19);
RowKey = revTicks + Guid.NewGuid().ToString();

Care has to be taken while querying. Since the row keys no longer correspond directly to ticks, it is not correct to use the <= and >= operators directly against row keys built from single timestamps. To get all events that occurred at time = T, one has to convert T and its neighbouring tick into row key prefixes and use them in the where condition of your query, as sketched below. One might ask whether this technique can be used on a table that is already using only reverse timestamps as row keys. The short answer is yes! As we discussed earlier, comparisons on entity keys are purely lexicographical. If there are three strings A, B and C, and if lexically A < B < C, adding any suffix to one or all of them does not affect how they are ordered afterwards.
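To make the time = T query concrete, here is a hedged sketch. Because the timestamp is reversed, the key prefix for a later tick is lexically smaller, so the bounds below are the prefix for T (inclusive) and the prefix for one tick earlier than T (exclusive); the Event class, table name, partition key and context are assumptions made for the example.

// Requires Microsoft.WindowsAzure.StorageClient and System.Linq.
string partitionKey = "BillingProcess";                                 // example event source
DateTime t = new DateTime(2010, 6, 22, 10, 30, 0, DateTimeKind.Utc);    // the instant of interest

// every row key written at time t starts with this 19-character prefix, followed by a GUID
string lowerInclusive = (DateTime.MaxValue.Ticks - t.Ticks).ToString().PadLeft(19);
// prefix for one tick earlier than t; with reversed ticks it is the lexically next prefix up
string upperExclusive = (DateTime.MaxValue.Ticks - t.Ticks + 1).ToString().PadLeft(19);

CloudTableQuery<Event> eventsAtT =
    (from e in tableServiceContext.CreateQuery<Event>("Events")
     where e.PartitionKey == partitionKey
           && e.RowKey.CompareTo(lowerInclusive) >= 0
           && e.RowKey.CompareTo(upperExclusive) < 0
     select e).AsTableServiceQuery<Event>();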
USING WORKER ROLES TO IMPLEMENT A DISTRIBUTED CACHE
By Josh Tucholski

One of the most sought-after goals for an aspiring application, viral growth, is also one of the quickest routes to failure if the application receives it unexpectedly. Windows Azure addresses the problem of viral growth with an elastic infrastructure that can quickly allocate additional instances of a service on an as-needed basis. However, as traffic and use of an application grow, it is inevitable that its database will suffer without any type of caching layer in place. In smaller environments it is sufficient to use the built-in cache that a single server provides for efficient data retrieval. This is not the case in Windows Azure. Windows Azure provides a transparent load balancer, which makes placing data in a specific server's cache impractical unless one can guarantee each user continually communicates with the same web server. Armed with a distributed cache and a well-built data access tier, one can address this issue and ensure that all clients issuing similar data requests only hit the database once and use the cached version going forward (pending updates).

CONFIGURING THE CACHE

One of the most popular distributed caching implementations is memcached, used by YouTube, Facebook, Twitter, and Wikipedia. Memcached can run from the command line as an executable, making it a great fit for a Windows Azure worker role. When memcached is active all of its data is stored in memory, which makes increasing the size of the cache as easy as increasing the worker instance count. The following code snippet demonstrates how a worker role initializes the memcached process and defines the required parameters identifying its unique instance IP address and the maximum size of the cache in MB. Note: you will need to include the memcached executable in your role package so that the process can be started within the Windows Azure fabric.

//Retrieve the endpoint information for the current role instance
IPEndPoint endpoint =
    RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["EndpointName"].IPEndpoint;
string cacheSize = RoleEnvironment.GetConfigurationSettingValue(CacheSizeKey);

//memcached arguments
//m = size of the cache in MB
//l = IP address of the cache server
//p = port of the cache server
string arguments = "-m " + cacheSize + " -l " + endpoint.Address + " -p " + endpoint.Port;

ProcessStartInfo startInfo = new ProcessStartInfo()
{
    CreateNoWindow = true,
    UseShellExecute = false,
    FileName = "memcached.exe",
    Arguments = arguments
};

//The worker role's only purpose is to execute the memcached process and run until shutdown
using (Process exeProcess = Process.Start(startInfo))
{
    exeProcess.WaitForExit();
}
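As the next section explains, cache clients then need to discover the endpoints of every running memcached instance. A minimal sketch using the service runtime API is shown below; the role name "CacheWorkerRole" and the endpoint name "EndpointName" are assumptions that must match your own service definition.

// Requires Microsoft.WindowsAzure.ServiceRuntime, System.Net and System.Linq.
List<IPEndPoint> cacheEndpoints = RoleEnvironment.Roles["CacheWorkerRole"]
    .Instances
    .Select(instance => instance.InstanceEndpoints["EndpointName"].IPEndpoint)
    .ToList();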
  • The Windows Azure Platform: Articles from the Trenches USING THE DISTRIBUTED CACHE Once the distributed cache instance is active, a client library, such as Enyim, is used to access the contents of the cache. The most challenging part of integrating Enyim with the distributed cache is identifying all of the cache endpoints available for access. Fortunately when using the Windows Azure API, internal endpoints are discoverable by worker role name. Hooks can be added to determine if at any point in time the client is out of sync with the actual number of worker role instances, causing an automatic refresh. The Web Role found in the Windows Azure Memcached Solution Accelerator, has a well written implementation of the Enyim client demonstrating how to detect its configuration. The following code snippet shows how simple it is to retrieve and store data in the cache once the configuration interface is implemented: //See the Windows Azure Memcached Solution Accelerator for instructions on implementing //the AzureMemcachedClientConfiguration class private static AzureMemcachedClientConfiguration _configuration; //MemcachedClient is provided through Enyim private static MemcachedClient _client; private static MemcachedClient Client { get { EnsureClientUpToDate(); return _client; } } private static void EnsureClientUpToDate() { //If a configuration exists, confirm that the endpoints it is //aware match the ones in Windows Azure if (_client == null || _configuration == null || _configuration.IsOutOfDate) { _configuration = new AzureMemcachedClientConfiguration(); _client = new MemcachedClient(_configuration); } } public object Get(string key) { //The client serves four key purposes: retrieval, storage, removing, and flushing the cache object val = Client.Get(key); return val; } public void Put(string key, object value) { //Stores the key/value pair in the distributed cache using the client //Available StoreMode operations //Add – adds the item to the cache only if it does not exist //Replace – replaces an item in the cache only if it does exist //Set – will add the item if it does not exist or replace it if it does if (!Client.Store(StoreMode.Set, key, value, DateTime.UtcNow.AddSeconds((double)_expiry))) { Console.WriteLine("MemcachedCache - could not save key " + key); } } Once the implementation of the client is in place, any part of the application that has access to the client can integrate with the distributed cache. Certain object-relational mapping tools, such as nHibernate, even contain support for cache providers. From this point, it is simple to construct a new cache provider and integrate with the Windows Azure distributed cache. I recommend hashing 69
your object keys if any other library shares the distributed cache, to avoid name collisions. Implementing a distributed cache has proven beneficial in most scenarios, as long as the application controls the data flowing in and out. If external systems modify the data by communicating with the database directly, you need to rethink the architecture of your distributed application, or at least invalidate your cache more frequently, to ensure that the data within it does not become stale. With Windows Azure, your distributed cache can remain highly available and continue serving requests while you add additional instances or recover from system failures.
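A minimal sketch of the hashing recommendation above follows; the choice of SHA-1 and Base64 is illustrative, not something the article prescribes, and any stable hash that produces memcached-safe key strings will do.

// Requires System.Security.Cryptography and System.Text.
private static string HashKey(string key)
{
    using (SHA1 sha = SHA1.Create())
    {
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(key));
        // Base64 keeps the key short and free of spaces, which memcached does not allow
        return Convert.ToBase64String(digest);
    }
}

// usage, reusing the Client wrapper from the snippet above:
// Client.Store(StoreMode.Set, HashKey("Product:42"), product, DateTime.UtcNow.AddMinutes(10));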
  • The Windows Azure Platform: Articles from the Trenches LOGGING, DIAGNOSTICS AND HEALTH MONITORING OF WINDOWS AZURE APPLICATIONS By David Gristwood Monitoring the health of an application is key to being able to keep it up and running and to help resolve problems as and when they arise. Most developers know how to do this to some degree with on-premise application, as part of the general maintenance of the application and server. However, an Azure application, running up in the cloud, is very different to a traditional application when it comes to monitoring and performing diagnostics, for many reasons. Firstly, the application will typically be running across a whole set of machines, managed by the Windows Azure fabric, and is dynamic and will change over time, so the problem is much more complex than that of a single machine. Secondly, there is no direct, system admin access to the machines running in Windows Azure, in the way you would have if you owned and managed the machines yourself – the Windows Azure fabric handles much of the complexities of deploying and managing roles and machines, and it doesn’t provide low level access to resources. And, finally, you can’t just attach a debugger to the cloud and step through your code. Fortunately Windows Azure has a diagnostic capability that allows you to monitor the health of your application across the different roles that make up your Azure application. It’s not about creating new APIs – but rather it’s about using the existing logging and tracing capabilities in the Windows platform that many developers are already familiar with, and building a monitoring strategy based on them, to cover scenarios such as debugging, troubleshooting, performance, resource usage monitoring, traffic analysis, capacity planning, and auditing. There are three key stages to using the Windows Azure diagnostics. Firstly, deciding what diagnostic data you wish to collect. Secondly, deciding when and what diagnostic data should be persisted out to Windows Azure for analysis. And finally, downloading the data from Windows Azure for analysis. COLLECTING DIAGNOSTIC DATA One of the most common pieces of diagnostic data to collect is the Windows Azure logs, which is where any System.Diagnostics.Trace messages embedded in an application are output to. These Trace messages are the main way to log the flow and status of an application and are built on the existing Event Tracing for Windows (ETW) capabilities. By default this trace data, the IIS 7.0 logs and the Windows Diagnostic infrastructure logs are all collected when you switch on the diagnostics. The Windows Diagnostic infrastructure logs help provide general purpose problem detection, troubleshooting, and resolution for Windows components. Diagnostics are initialized within a role’s OnStart() method: public override bool OnStart() { var config = DiagnosticMonitor.GetDefaultInitialConfiguration(); // Get default initial configuration 71
// any other data sources that need to be tracked are added here
DiagnosticMonitor.Start("DiagnosticsConnectionString", config);

Additional data sources can be added to the DiagnosticMonitor configuration before the Start() method is called. For diagnostics, the Crash dumps and Windows Event logs can prove invaluable. For fine tuning and capacity planning the Performance counters (which include CPU, memory, paging, etc.) are essential.

PERSISTING DIAGNOSTIC DATA

All the diagnostic data collected is stored in the local file store of the virtual machine within the Windows Azure fabric. The local file store will not survive machine recycles or rebuilds and therefore the diagnostic data needs to be transferred to a persistent store, such as Windows Azure storage. These transfers can take place either as regular scheduled events, perhaps every 10 minutes, or on demand. Setting up a scheduled transfer is as easy as setting the ScheduledTransferPeriod property on the appropriate data source before the call to DiagnosticMonitor.Start():

// schedule transfer of basic logs to Azure storage
config.Logs.ScheduledTransferPeriod = System.TimeSpan.FromMinutes(1.0);
DiagnosticMonitor.Start("DiagnosticsConnectionString", config);

An on-demand transfer can be initiated from outside a Windows Azure application, which makes it possible to control persisting data from a dashboard or system support application.

ANALYSING THE DIAGNOSTIC DATA

The default behaviour of a transfer of diagnostic data is to persist the data to a set of "wad" (Windows Azure Diagnostics) prefixed Windows Azure Tables and Blob containers (Crash dumps go into Blob storage, Windows Azure logs into Tables, etc.). These can then be inspected online, with tools such as Cerebrata's Azure Diagnostic Manager (see screenshot), or downloaded using the REST-based API for local viewing and analysis.
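For example, the persisted Windows Azure logs can be read back with the same table storage client used elsewhere in this book. This is a hedged sketch: the WADLogsTable name is standard, but the entity property names shown follow the commonly documented schema and should be checked against your own table, and the connection string is a placeholder.

// Requires Microsoft.WindowsAzure and Microsoft.WindowsAzure.StorageClient.
public class WadLogEntry : TableServiceEntity
{
    public long EventTickCount { get; set; }
    public string Role { get; set; }
    public string RoleInstance { get; set; }
    public int Level { get; set; }
    public string Message { get; set; }
}

CloudStorageAccount account =
    CloudStorageAccount.Parse("your diagnostics storage connection string");
TableServiceContext context = account.CreateCloudTableClient().GetDataServiceContext();
context.ResolveType = (unused) => typeof(WadLogEntry);

// pull back a first batch of trace messages for a quick look
foreach (WadLogEntry entry in
    (from e in context.CreateQuery<WadLogEntry>("WADLogsTable") select e)
        .Take(100).AsTableServiceQuery<WadLogEntry>().Execute())
{
    Console.WriteLine(entry.RoleInstance + ": " + entry.Message);
}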
For analysis, tuning, or resolving more complex issues, storing log and trace information in SQL Server will make it easier to filter the relevant information and detect process flow, exceptions, etc. As with all debugging and monitoring scenarios, the key is to ensure good quality information is embedded within applications, especially to help track flow across multiple machines and roles.

MORE INFORMATION

You can view Matthew Kerner's excellent PDC09 session http://microsoftpdc.com/Sessions/SVC15 and the demos from the talk can be downloaded from http://code.msdn.microsoft.com/WADiagnostics. The MSDN documentation can be found at http://msdn.microsoft.com/en-us/library/ee758705.aspx
  • The Windows Azure Platform: Articles from the Trenches SERVICE RUNTIME IN WINDOWS AZURE By Neil Mackenzie ROLES AND INSTANCES Windows Azure implements a Platform as a Service model through the concept of roles. There are two types of role: a web role deployed with IIS; and a worker role similar to a windows service. Azure implements horizontal scaling of a service through the deployment of multiple instances of roles. Each instance of a role is allocated exclusive use of a VM selected from one of several sizes - from a small instance with 1 core to an extra-large instance with 8 cores. Memory and local disk space also increase with instance size. All inbound network traffic to a role passes through a stateless load balancer which uses an unspecified algorithm to distribute inbound calls to the role among instances of the role. Individual instances do not have public IP addresses and are not directly addressable from the Internet. Instances can connect directly to other instances in the service using TCP and HTTP. Azure provides two deployment slots: staging for testing in a live environment; and production for the production service. A role with a public endpoint has a permanent URL in the production slot and a temporary URL in the staging slot. Otherwise, there is no real difference between the two slots. ENDPOINTS An Azure role has two types of endpoint: a public-facing input endpoint; and a private internal endpoint for communication among instances. Input endpoints and internal endpoints are associated with an Azure role through specification in the Service Definition file. A web role may have only one HTTP input endpoint and one HTTPS input endpoint. A worker role may have an unlimited number of HTTP, HTTPS and TCP input endpoints as long as each is associated with a different port number. External services make connection requests to the Virtual IP address for the role and the input endpoint port specified for the role in the Service Definition file. These connection requests are load balanced and forwarded to an Azure-allocated port on one of the instances of the role. A web role may have only one HTTP internal endpoint. A worker role may have an unlimited number of HTTP and TCP internal endpoints, the only limitation being that each internal endpoint must have a unique name. SERVICE UPGRADES There are two ways to upgrade an Azure service: in-place upgrade and Virtual IP (VIP) swap. An in- place upgrade replaces the contents of a deployment slot with a new Azure application package and configuration file. A VIP swap simply swaps the virtual IP address associated with the production and staging slots. Note that it is not possible to do an in-place upgrade where the new application package has a modified Service Definition file. Instead, any existing service in one of the slots must 74
  • The Windows Azure Platform: Articles from the Trenches be deleted before the new version is uploaded. A VIP swap does support modifications to the Service Definition file. The Azure SLA comes into force only when a service uses at least two instances per role. Azure uses upgrade domains and fault domains to facilitate adherence to the SLA. The Azure fabric deploys instances over several upgrade domains. The Azure fabric implements an in-place upgrade of a role by bringing down all the instances in a single upgrade domain, upgrading them, and then restarting them before moving on to the next upgrade domain. The number of upgrade domains is configurable through the upgradeDomainCount attribute (default 5) to the ServiceDefinition root element in the Service Definition file. The Azure fabric completely controls the allocation of instances to upgrade domains though an Azure service can view the upgrade domain for each of its instances through the RoleInstance.UpdateDomain property. When Azure instances are deployed, the Azure fabric spreads them among different fault domains which means they are deployed so that a single hardware failure does not bring down all the instances. The Azure fabric completely controls the allocation of instances to fault domains though an Azure service can view the fault domain for each of its instances through the RoleInstance.FaultDomain property. SERVICE DEFINITION AND SERVICE CONFIGURA TION An Azure service is defined and configured through its Service Definition and Service Configuration files. The Service Definition file specifies the roles contained in the service along with the following for each role:  upgradeDomainCount - number of upgrade domains for the service  vmsize - the instance size from Small through ExtraLarge  ConfigurationSettings - defines the settings used to configure the service  LocalStorage- specifies the amount and name of disk space on the local VM  InputEndpoints - defines the external endpoints for a role  InternalEndpoint - defines the internal endpoints for a role  Certificates - specifies the name and location of the X.509 certificate store The Service Configuration file provides the configured values for:  osVersion - specifies the Azure guest OS version for the deployed service  Instances - specifies the number of instances of a role  ConfigurationSettings - specifies the role-specific configuration parameters  Certificates - specifies X.509 certificates for the role The Service Configuration file comprises one of the two distinct parts of the service application package and consequently can be modified through an in-place upgrade. It can also be modified directly on the Azure portal. ROLEENTRYPOINT 75
  • The Windows Azure Platform: Articles from the Trenches RoleEntryPoint is the base class providing the Azure fabric an entry point to a role. All worker roles must contain a class derived from RoleEntryPoint but web roles can use ASP.Net lifecycle management instead. The standard Visual Studio worker role template provides a starter implementation of the necessary derived class. RoleEntryPoint is declared: public abstract class RoleEntryPoint { protected RoleEntryPoint(); public virtual Boolean OnStart(); public virtual void OnStop(); public virtual void Run(); } The Azure fabric initializes the role by invoking the overridden OnStart() method. Prior to this call the status of the role is Busy. Note that a web role can put initialization code in Application_Start instead of OnStart(). The overridden Run() is invoked following successful completion of OnStart() and provides the primary working thread for the role. An instance recycles automatically when Run() exits so care should be taken, through use of Thread.Sleep() for example, that the Run() method does not terminate. Azure invokes the overridden OnStop() during a normal suspension of the role. The Azure fabric stops the role automatically if OnStop() does not return within 30 seconds. Note that a web role can put shutdown code in Application_End instead of OnStop(). ROLE The Role class represents a role in an Azure service. It exposes the Name of the role and a collection of deployed Instances for it. ROLEENVIRONMENT The RoleEnvironment class provides functionality allowing an instance to interact with the Azure fabric as well as functionality providing access to the Service Configuration file and limited access to the Service Definition file. 76
  • The Windows Azure Platform: Articles from the Trenches RoleEnvironment is declared: public sealed class RoleEnvironment { public static event EventHandler<RoleEnvironmentChangedEventArgs> Changed; public static event EventHandler<RoleEnvironmentChangingEventArgs> Changing; public static event EventHandler<RoleInstanceStatusCheckEventArgs> StatusCheck; public static event EventHandler<RoleEnvironmentStoppingEventArgs> Stopping; public static RoleInstance CurrentRoleInstance { get; } public static String DeploymentId { get; } public static Boolean IsAvailable { get; } public static IDictionary<String,Role> Roles { get; } public static String GetConfigurationSettingValue( String configurationSettingName); public static LocalResource GetLocalResource(String localResourceName); public static void RequestRecycle(); } The IsAvailable property specifies whether or not the Azure environment is available. DeploymentId identifies the current deployment, Roles specifies the roles contained in the current service, and CurrentRoleInstance is a RoleInstance object representing the current instance. Note that Roles reports all Instances as being of zero size except the current instance and any instance with an internal endpoint. GetConfigurationSettingValue() retrieves a configuration setting for the current role from the Service Configuration file. GetLocalResource() returns a LocalResource object specifying the root path for any local storage for the current role defined in the Service Definition file. RequestRecycle() initiates a recycle, i.e., stop and start, of the current instance. The RoleEnvironment class also provides four events to which a role can register a callback method to be notified about various changes to the Azure environment. A role typically registers callback methods with these events in its OnStart() method. The StatusCheck event is raised every 15 seconds. An instance can use the SetBusy() method of the RoleInstanceStatusCheckEventArgs class to indicate it is busy and should be taken out of the load- balancer rotation. The Stopping event is raised when an instance is undergoing a controlled shutdown although there is no guarantee it will be raised when an instance is shutting down due to an unhandled error. Note that the Stopping event is raised before the overridden OnStop() method is invoked. The Changing event is raised before and the Changed event after a configuration change is applied to the role. The callback method for the Changing event has access to the old value of the configuration setting and can be used to control whether or not the instance should be restarted in response to the configuration change. The callback method for the Changed event has access to the new value of the configuration setting and can be used to reconfigure the instance in response to the change. The Changing and Changed callback methods are also used to handle topology changes to the service in which the number of instances of a role is changed. ROLEINSTANCE 77
  • The Windows Azure Platform: Articles from the Trenches The RoleInstance class represents an instance of a role. It is declared: public abstract class RoleInstance { public abstract Int32 FaultDomain { get; } public abstract String Id { get; } public abstract IDictionary<String,RoleInstanceEndpoint> InstanceEndpoints { get; } public abstract Role Role { get; } public abstract Int32 UpdateDomain { get; } } FaultDomain and UpdateDomain specify respectively the fault domain and upgrade domain for the instance. Role identifies the role and Id uniquely identifies the instance of the role. InstanceEndpoints is an IDictionary<> linking the name of each instance endpoint specified in the Service Definition file with the actual definition of the RoleInstanceEndpoint. Note that each instance of a role has distinct actual RoleInstanceEndpoint for each specific instance endpoint defined in the Service Definition file. ROLEINSTANCEENDPOINT The RoleInstanceEndpoint class represents an input endpoint or internal endpoint associated with an instance. It has two properties: RoleInstance identifying the instance associated with the endpoint; and IPEndpoint containing the local IP address of the instance and the port number for the endpoint. LOCALRESOURCE LocalResource represents the local storage, on the file system of the instance, defined for the role in the Service Definition file. Each instance has its own local storage that is not accessible from other instances. LocalResource exposes three read-only properties: Name uniquely identifying the local storage; MaximumSizeInMegabytes specifying the maximum amount of space available; RootPath specifying the root path of the local storage in the local file system. 78
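Putting several of these classes together, the sketch below shows the kind of OnStart() wiring described above: registering for the Changing and StatusCheck events, and reading configuration and local storage. The setting name "MySetting", the local storage name "ScratchSpace" and the warm-up flag are illustrative assumptions, not part of the service runtime API.

// Requires Microsoft.WindowsAzure.ServiceRuntime and System.Linq.
public class WorkerRole : RoleEntryPoint
{
    private static volatile bool warmingUp = true;

    public override bool OnStart()
    {
        // restart the instance only if something other than a plain setting value changed
        RoleEnvironment.Changing += (sender, e) =>
        {
            e.Cancel = e.Changes.Any(
                change => !(change is RoleEnvironmentConfigurationSettingChange));
        };

        // report Busy while warming up so the load balancer skips this instance
        RoleEnvironment.StatusCheck += (sender, e) =>
        {
            if (warmingUp)
            {
                e.SetBusy();
            }
        };

        string setting = RoleEnvironment.GetConfigurationSettingValue("MySetting");
        LocalResource scratch = RoleEnvironment.GetLocalResource("ScratchSpace");
        // ... use setting and scratch.RootPath to initialise the role ...

        warmingUp = false;
        return base.OnStart();
    }
}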
  • The Windows Azure Platform: Articles from the Trenches CHAPTER 4: SQL AZURE CONNECTING TO SQL AZURE IN 5 MINUTES By Juliën Hanssens "Put your data in the cloud!" Think about it… no more client side database deployment, no more configuring of servers, yet with your data mirrored and still accessible using comfortable familiarities for SQL Server developers. That’s SQL Azure. In this article we will quickly boost you up to speed on how to get started with your own SQL Azure instance in less than five minutes! PREREQUISITE – GET A SQL AZURE ACCOUNT Let’s assume you already have a SQL Azure account. If not, you’re free to try one of the special offers that Microsoft has available on Azure.com[1] like the free-of-charge Introductory Special or the offer that is available for MSDN Premium subscribers. WORKING WITH THE SQL AZURE PORTAL With a SQL Azure account at your disposal, you first need to login to the SQL Azure Portal[2]. This is your dashboard for managing your own server instances. The first time you login to the SQL Azure Portal, and after first accepting the Terms of Use, you will be asked to create a server instance for SQL Azure like the screenshot below illustrates: 1: Create a server through the SQL Azure Portal Providing a username and password is pretty straight forward. Do notice that these credentials will be the equivalent of your “sa” SQL Server account, for which logically strong password rules apply. And certain user names are not allowed for security reasons. With the location option you can select the physical location of the datacenter at which your server instance will be hosted. It is advisable to select the geographical location nearest to your – or your users - needs. 79
Once you press the Create Server button it takes a second or two to initialize your fresh, new server and you'll be redirected to the Server Administration subsection. Congratulations, you've just performed a "SQL Server installation in the cloud"!

CREATE A DATABASE THROUGH THE SERVER ADMINISTRATION

Whilst still in the SQL Azure Portal[2] Server Administration section, our server details are listed, like the name used for the connection string, together with a list of databases. The latter is, by default, only populated with a 'master' database. Exactly as in SQL Server, this specific database contains the system-level information, such as system configuration settings and logon accounts. We are going to leave the master database untouched and create a new database by pressing the Create Database button.

2: Create a database through the SQL Azure Portal

On confirmation the database will be created in the "blink of an eye". But for those who find this too convenient, you can achieve the same result using a slim script like:

CREATE DATABASE SqlAzureSandbox
GO

However, in order to be able to feed our database some scripts, we need to set up security and get our hands on a management tool. And for the latter, why not use the tool we have relied on for ages to connect to our "regular" SQL Server instances: SQL Server Management Studio R2 (SSMS).

CONFIGURING THE FIREWALL

By default you cannot initially connect to SQL Azure with tools like SSMS. At least, not until you explicitly tell your SQL Azure instance that you want a specific IP address to be allowed to connect, with pretty much all administrative privileges.
  • The Windows Azure Platform: Articles from the Trenches To enable connectivity, add a rule by entering your public IP address in the Firewall Settings tab on your SQL Azure Portal. 3: Add a firewall rule through the SQL Azure Portal’s Server Administration Do notice the “Allow Microsoft Services access to this server” checkbox. By enabling this you allow other Windows Azure services to access your server instance. CONNECTING USING SQL SERVER MANAGEMENT STUDIO Having set up everything required for enabling SSMS to manage the database, let’s start using it. If you haven’t done so already, install the latest R2 release of the SSMS[3] first. Older versions will just bore you with annoying error messages, so don’t waste time on that. Once in place, boot up the 81
  • The Windows Azure Platform: Articles from the Trenches SSMS application, enter the full server name and authenticate using the provided credentials. 4: Connecting SQL Server Management Studio to your SQL Azure instance No rocket science there either. Optionally you can provide a specific database instance to connect to in the Options section (more on that later). Once connected, you have a pretty similar environment with SSMS on SQL Azure as you have on a ‘regular’ SQL Server instance. Although keep in mind that with the current installment you have to do without the comfortable dialog boxes. This means you need to brush up your skills with T-SQL. SQL Azure offers a subset, albeit significant subset, of the familiar T-SQL features and commands you are used to using with SQL Server. This is due to the fact that SQL Azure is designed natively for the Windows Azure platform In a nutshell this means that the creation of tables, views, logins, stored procedures etc. by using scripts is roughly the same in T-SQL syntax but only lacks certain (optional) parameters. Let’s demonstrate this by creating an arbitrary table. In SSMS right click on the Tables section of our SqlAzureSandbox database and select “New Table”. The result will be no dialog box with fancy fields, but a basic SQL script for us to edit. Once modified, it doesn’t really differ from your average SQL Server script. For example: -- ========================================= -- Create table template SQL Azure Database -- ========================================= IF OBJECT_ID('[dbo].[Beer]', 'U') IS NOT NULL DROP TABLE [dbo].[Beer] GO CREATE TABLE [dbo].[Beer] ( [Id] int NOT NULL, [BeerName] nvarchar(50) NULL, [CountryOfOrigin] nvarchar(50) NULL, [AlcoholPercentage] int NULL, [DateAdded] datetime NOT NULL, 82
    CONSTRAINT [PK_Beer] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    )
)
GO

Once executed, the table is generated. This is one thing to take notice of: tables have to be created through SSMS by default. But once they're available you can simply boot up Visual Studio and use the Server Explorer to access them in your project. In fact, you can even use familiar tools with design-time support like LINQ to SQL, ADO.NET DataSets or Entity Framework for even more productivity.

APPLICATION CREDENTIALS

Last but not least, a recommendation on security. Up until now we have used our godlike master credentials for managing our database. We really don't want these credentials to be included in our application, so let's create a lightweight custom user/login for our application to use:

-- 1. Create a login
CREATE LOGIN [ApplicationLogin] WITH PASSWORD = 'I@mR00tB33r'
GO

-- 2. Create a user
CREATE USER [MyBeerApplication] FOR LOGIN [ApplicationLogin]
WITH DEFAULT_SCHEMA = [db_datareader]
GO

-- 3. And grant it access permissions
GRANT CONNECT TO [MyBeerApplication]
GO

KEEP IN MIND – THE TARGET DATABASE

As you may have noticed, all samples lack the USE statement, i.e. "USE [SqlAzureSandbox]". This is because the USE <database> command is not supported. With SQL Azure you should keep in mind that each database can be on a different server and therefore requires a separate connection. With SSMS you can easily achieve this in the options of the Connect to Server dialog box:
  • The Windows Azure Platform: Articles from the Trenches 5: Connect to a specific database using SQL Server Management Studio Take notice of this when you are frequently switching between databases. And with that in mind, the sky is the limit. Even in the cloud. 1. Microsoft Windows Azure Platform http://www.azure.com 2. SQL Azure Portal http://sql.azure.com 3. SQL Server 2008 R2 Management Studio Express http://tinyurl.com/ssmsr2rtm (SSMSE) 84
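As a closing illustration of the ApplicationLogin created above, the snippet below shows one way an application might connect with ADO.NET. The server name and password handling are placeholders, SQL Azure expects the User ID in the login@servername form, and the exact connection string for your own server can be copied from the SQL Azure Portal.

// Requires System.Data.SqlClient.
string connectionString =
    "Server=tcp:yourserver.database.windows.net;" +
    "Database=SqlAzureSandbox;" +
    "User ID=ApplicationLogin@yourserver;" +
    "Password=I@mR00tB33r;" +
    "Trusted_Connection=False;Encrypt=True;";

using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("SELECT [BeerName] FROM [dbo].[Beer]", connection))
{
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine(reader["BeerName"]);
        }
    }
}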
  • The Windows Azure Platform: Articles from the Trenches CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC REAL TIME TRACING OF AZURE ROLES FROM YOUR DESKTOP By Richard Prodger One of the big challenges faced with a deployed Azure hosted role is how to get access to tracing information. Well, you can use the Azure Diagnostics to collect data in table storage but this is far from ideal as you have to read the data out and that doesn’t give you real time information. There is a better way! The .NET Framework already provides the TraceListener that most of you will be familiar with. By creating your own custom TraceListener, you can push trace messages anywhere you like. Then, by using the magic provided by the service bus for traversing firewalls, you can pick up these trace messages in an application running on your desktop. CUSTOM TRACE LISTENER We need a client to send the messages and a server to receive them. Let’s start with the Azure client. The first thing to do is implement the custom TraceListener: public class AzureTraceListener : TraceListener { ITrace traceChannel; public AzureTraceListener(string serviceNamespace, string servicePath, string issuerName, string issuerSecret) { // Create the endpoint address for the service bus Uri serviceUri = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, servicePath); EndpointAddress endPoint = new EndpointAddress(serviceUri); // Setup the authentication TransportClientEndpointBehavior credentials = new TransportClientEndpointBehavior(); credentials.CredentialType = TransportClientCredentialType.SharedSecret; credentials.Credentials.SharedSecret.IssuerName = issuerName; credentials.Credentials.SharedSecret.IssuerSecret = issuerSecret; // Create the channel and open it ChannelFactory<ITrace> channelFactory = new ChannelFactory<ITrace>(new NetEventRelayBinding(), endPoint); channelFactory.Endpoint.Behaviors.Add(credentials); traceChannel = channelFactory.CreateChannel(); } public override void WriteLine(string message) { traceChannel.WriteLine(message); } public override void Write(string message) { traceChannel.Write(message); 85
  • The Windows Azure Platform: Articles from the Trenches } } As you can see, there is some setup stuff for WCF and the service bus, but basically all you have to do is override the Write and WriteLineMethods. The ITrace interface is simple as well: [ServiceContract] public interface ITrace { [OperationContract(IsOneWay=true)] void WriteLine(string text); [OperationContract(IsOneWay = true)] void Write(string text); } SEND MESSAGE CONSOLE APPLICATION Now we need an app to send the messages. For the purposes of this article, I have created a simple console app, but this could be any Azure role. static void Main(string[] args) { string issuerName = "yourissuerName"; string issuerSecret = "yoursecret"; string serviceNamespace = "yourNamespace"; string servicePath = "tracer"; TraceListener traceListener = new AzureTraceListener(serviceNamespace, servicePath, issuerName, issuerSecret); Trace.Listeners.Add(traceListener); Trace.Listeners.Add(new TextWriterTraceListener(Console.Out)); while (true) { Trace.WriteLine("Hello world at " + DateTime.Now.ToString()); Thread.Sleep(1000); } } This simple app simply creates a new custom TraceListener and adds it to the TraceListener’s collection and that pushes out a timestamp every second. I’ve also added Console.Out as another listener so you can see what’s being sent. TRACE SERVICE So that’s the Azure end done, what about the desktop end? The first thing you have to do is implement the TraceService that the custom listener will call: public class TraceService : ITrace { public static event ReceivedMessageEventHandler RecievedMessageEvent; void ITrace.WriteLine(string text) { 86
  • The Windows Azure Platform: Articles from the Trenches RecievedMessageEvent(this, text); } void ITrace.Write(string text) { RecievedMessageEvent(this, text); } } public delegate void ReceivedMessageEventHandler(object sender, string message); The event delegate is there to push out the messages to the app hosting this class. SERVICE HOST CLASS Next, we have to create the class that will host the service: public class AzureTraceReceiver { ServiceHost serviceHost; public AzureTraceReceiver (string serviceNamespace, string servicePath, string issuerName, string issuerSecret) { // Create the endpoint address for the service bus Uri serviceUri = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, servicePath); EndpointAddress endPoint = new EndpointAddress(serviceUri); // Setup the authentication TransportClientEndpointBehavior credentials = new TransportClientEndpointBehavior(); credentials.CredentialType = TransportClientCredentialType.SharedSecret; credentials.Credentials.SharedSecret.IssuerName = issuerName; credentials.Credentials.SharedSecret.IssuerSecret = issuerSecret; serviceHost = new ServiceHost(typeof(TraceService)); ServiceEndpoint endpoint = serviceHost.AddServiceEndpoint(typeof(ITrace), new NetEventRelayBinding(), serviceUri); endpoint.Behaviors.Add(credentials); } public void Start() { serviceHost.Open(); } public void Stop() { serviceHost.Close(); } } This is basic WCF code, nothing special here. All we do is create some credentials for authenticating with the service bus, create an endpoint, add the credentials and start up the service host. 87
SERVICE

Now all we have to do is implement the desktop app. Again, for simplicity, I am creating a simple console app:

static void Main(string[] args)
{
    Console.Write("AZURE Trace Listener Sample started.\nRegistering with Service Bus...");

    string issuerName = "yourissuerName";
    string issuerSecret = "yoursecret";
    string serviceNamespace = "yourNamespace";
    string servicePath = "tracer";

    // Start up the receiver
    AzureTraceReceiver receiver = new AzureTraceReceiver(serviceNamespace, servicePath,
        issuerName, issuerSecret);
    receiver.Start();

    // Hook up the event handler for incoming messages
    TraceService.RecievedMessageEvent +=
        new ReceivedMessageEventHandler(TraceService_myEvent);

    // Now, just hang around and wait!
    Console.WriteLine("DONE\nWaiting for trace messages...");
    string input = Console.ReadLine();
    receiver.Stop();
}

static void TraceService_myEvent(object sender, string message)
{
    Console.WriteLine(message);
}

This app simply instantiates the receiver class and starts the service host. An event handler is registered and the app then just waits for messages. When the client sends a trace message the event handler fires and the message is written to the console.

You may have noticed that I have used the NetEventRelayBinding for the service bus. This was deliberate as it allows you to hook up multiple server ends to receive the messages in a classic pub/sub pattern. This means you can run multiple instances of this server on multiple machines and they all receive the same messages. You can use other bindings if required. Another advantage of this binding is that you don't have to have any apps listening, but bear in mind you will be charged for the connection whether you are listening or not, although you won't have to pay for the outbound bandwidth.

I put all the WCF and service bus setup into the code, but this could easily be placed into a configuration file. I prefer it this way as I have a blind spot when it comes to reading WCF config in XML and I always get it wrong, but it does mean you can't change the bindings without recompiling.

SUMMARY

There is more that could be done in the TraceListener class to improve thread safety, error handling and to ensure that the service bus channel is available when you want to use it, but I'll leave that up to you. This
  • The Windows Azure Platform: Articles from the Trenches code was first put together whilst the AppFabric ServiceBus was in private beta. Microsoft has now included a version of this code in their SDK samples, so take a look there. So that's it. You now have the ability to monitor your Azure roles from anywhere. 89
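As one hedged illustration of the hardening mentioned in the summary, the WriteLine override could swallow transport failures and serialize access to the channel so that a tracing hiccup can never take the role down; the lock object and the choice of exceptions to catch are assumptions made for this sketch, not part of the original sample.

// Requires System.ServiceModel in addition to the namespaces already used by the listener.
private readonly object channelLock = new object();

public override void WriteLine(string message)
{
    try
    {
        lock (channelLock)
        {
            traceChannel.WriteLine(message);
        }
    }
    catch (CommunicationException)
    {
        // the relay is unreachable - drop the message (or fall back to a local log) rather than throw
    }
    catch (TimeoutException)
    {
        // same idea: tracing must never bring the role down
    }
}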
  • The Windows Azure Platform: Articles from the Trenches MEET THE AUTHORS ERIC NELSON After many years of developing on UNIX/RDBMS (and being able to get mortgages) Eric joined Microsoft in 1996 as a Technical Evangelist (and stopped being able to get mortgages due to his new 'unusual job title' in the words of his bank manager). He has spent most of his time working with ISVs to help them architect solutions which make use of the latest Microsoft technologies - from the beta of ASP 1.0 through to ASP.NET, from MTS to WCF/WF and from the beta of SQL Server 6.5 through to SQL Server 2008. Along the way he has met lots of smart and fun developers - and been completely stumped by many of their questions! In July 2008 he switched role from an Application Architect to a Developer Evangelist in the Developer and Platform Group. Developer Evangelist, Microsoft UK Developer Evangelist, Microsoft UK Website: http://www.ericnelson.co.uk Email: eric.nelson@microsoft.com Blog: http://geekswithblogs.net/iupdateable Twitter: http://twitter.com/ericnel MARCUS TILLETT Marcus Tillett is currently the Head of Technology at Dot Net Solutions, where he currently heads the technical team of architects and developers. Having been building solutions with Microsoft technologies for more than 10 years, his expertise is in software architecture and application development. He is passionate about understanding and using the latest cutting- edge technology. He is author of “Thinking of... Delivering Solutions on the Windows Azure Platform?” (http://www.bit.ly/a0P02n). Head of Technology at Dot Net Solutions Twitter: @drmarcustillett Blog: http://www.dotnetsolutions.co.uk/blog 90
  • The Windows Azure Platform: Articles from the Trenches RICHARD PRODGER Richard Prodger is a founding Technical Director of Active Web Solutions with more than 25 years experience in the R&D and computing sectors. Richard’s primary responsibilities are technical strategy and systems development. Richard is the Director responsible for the AWS Technology Centre. Prior to joining AWS, Richard managed BT's Web Services unit. At BT, Richard was responsible for implementing large scale e-commerce and web based systems and for translating emerging technology into practical business solutions. Richard was the principal architect and technical design authority for the multi-award winning RNLI Sea Safety system. More recently, Richard has been working closely with Microsoft on their cloud services platform, Windows Azure. Technical Director of Active Web Solutions www.aws.net SAKSHAM GAUTAM Saksham Gautam started working with Windows Azure right from the early stages of the development of the platform. He is an MCTS on WCF, and he graduated with Bachelors in Computer Science in 2007. Since then, he has been working as a Software Developer for AWS. Saksham is one of the architects and the lead developer for porting the existing on-premise sea safety system to Windows Azure. Apart from Azure and .NET, he is interested in distributed systems composed of heterogeneous components. He presented his work on interoperability in Windows Azure at the Architect Insight Conference 2010. He is currently based in Prague and builds interesting software, particularly using C#.NET. Software Developer/Architect, Active Web Solutions Twitter: @sakshamgautam Blog: http://sakshamgautam.blogspot.com 91
STEVE TOWLER
Steve Towler is a Senior Software Developer for Active Web Solutions in Ipswich and has been working with Windows Azure since April 2009. In that time he has helped develop a number of applications hosted in Windows Azure, including a CAD drawing collaboration tool and a location-based services application. Steve has also been conducting a number of Azure Assessment Days in conjunction with Microsoft and promoting the benefits of cloud computing.
Senior Software Developer, Active Web Solutions
Blog: http://www.stevetowler.co.uk

ROB BLACKWELL
Rob Blackwell is R&D Director at Ipswich-based Active Web Solutions. He was part of a team that won an unprecedented three British Computer Society awards in 2006 and was a Microsoft Most Valuable Professional (MVP) in 2007 and 2008. Rob is a self-confessed language nerd and freely admits that the real reason he's interested in running Java on Azure is so that he can host his spare-time Clojure Lisp experiments.
R&D Director, Active Web Solutions
Blog: www.robblackwell.org.uk
Twitter: http://twitter.com/RobBlackwell

JULIËN HANSSENS
Juliën Hanssens is a Software Engineer and Technical Consultant in software technologies at Securancy Intelligence, a Dutch IT company.
Software Engineer, Securancy Intelligence
Email: j.hanssens@securancy.com
SIMON MUNRO
Simon Munro, a senior consultant at London-based EMC Consulting, has been designing and developing commercial applications for two decades. Despite this, he still has a deep-rooted need to write production code every day. Branded as a thought-leader, Simon enjoys stirring things up, challenging accepted norms and asking difficult questions. His current endeavours include helping developers and customers understand the underlying architectural concepts around cloud computing.
Senior Consultant, EMC Consulting
Blog: http://simonmunro.com
Twitter: @simonmunro

SARANG KULKARNI
Sarang is an Analyst Programmer with Accenture-Avanade during work hours and a technology nomad after that. He has been coding for food and gadgets for the past 8 years around all things Microsoft, including ATL/COM, VB6, the .NET Framework from 2.0 onwards, WinForms, WCF, ASP.NET, WIF and now Windows Azure, targeting varied assignments ranging from run-of-the-mill enterprise LOB applications to astrometry APIs and media transcoding solutions in the cloud. He lives in Pune, India, with his daughter Saee and wife Prajakta.
Analyst Programmer, Accenture-Avanade
Blog: http://geekswithblogs.net/iunknown
Email: sarangbk@gmail.com

STEVEN NAGY
By day, Steven is a .NET consultant who likes diving deep into the technologies he is passionate about, and he has been learning, teaching and presenting on Azure since its first public release at PDC 08. By night he cackles gleefully, basking in the glow of his laptop screen as thousands of Azure worker roles carry out his evil bidding.
.NET Consultant
Blog: http://azure.snagy.name/blog/
Twitter: snagy
GRACE MOLLISON
Grace's role as a Platform Architect at EMC bridges the gap between Infrastructure and Development. Her activities range from supporting the development teams throughout the development life cycle and liaising between the client, third parties (e.g. hosting partners) and EMC Consulting as required, to advising on and architecting the platform. Grace has a lot of enthusiasm for public cloud solutions and has been dabbling with Azure since the early betas. She was part of the team that developed the 'See The Difference' solution, which was built using Windows Azure and SQL Azure. Grace is a CISSP (Certified Information Systems Security Professional). She joined EMC in 2008 from Hogg Robinson, where she was responsible for the design, implementation and ongoing maintenance, support and evolution of their eCommerce platform, which has BizTalk at its core.
Platform Architect, EMC
Blog: http://consultingblogs.emc.com/gracemollison/

JASON NAPPI
Jason is a Software Architect at SmartPak, where he advances their eCommerce engine and line-of-business applications. He has 14 years' experience as a developer, mainly on the Microsoft stack, developing applications beginning with VB 6, MTS/COM+ and ASP, through each successive version of the .NET Framework, and has even picked up a certification along the way. He has held roles in a variety of industries in the Boston area, including health care, web hosting, financial services and eCommerce. Most recently Jason has been struggling to keep pace with the Entity Framework, Silverlight, Azure, ASP.NET MVC, WCF REST, WCF Data Services, and the myriad of other technologies pouring out of Redmond and elsewhere.
Software Architect, SmartPak
Blog: http://blog.nappisite.com
JOSH TUCHOLSKI
Josh works for Rosetta as a Senior Technology Associate in the Microsoft Solution Center, where he helps Rosetta deliver interactive marketing solutions to clients in the financial, eCommerce, B2B and healthcare sectors. He has experience working in environments ranging from small teams to large enterprises, and focuses on WCF service development, RIA applications and middle-tier component integration. He strives to produce simple solutions with sound technical architectures that can tell a great story at the end of the day. Outside of development, he enjoys meeting with students in computer science and software engineering to pick their brains and help them prepare for their professional careers. Josh lives in Ohio with his wife Andrea.
Senior Technology Associate, Rosetta
LinkedIn: http://www.linkedin.com/in/joshtucholski
Twitter: http://www.twitter.com/jtucholski
Blog: http://www.dontforgetyourtodos.com/

DAVID GRISTWOOD
Ever since he wrote his first '10 ? "hello world" : goto 10' program on a PET computer in the late 70s he has been hooked, and he has worked with computers ever since. During his career, David has secured a Distinction in Computing Science at Newcastle University, worked as a freelance computer journalist, a visiting lecturer in Computer Science and a director of a software company, as well as having designed and developed a wide range of software and computer systems. For the last 15 years David has worked at Microsoft, first in its fledgling consultancy services section and then in EMEA as a technical evangelist. Since Microsoft's launch of .NET he has been focused on the .NET platform, helping design and build a wide range of systems, from smart clients to web applications and, more recently, cloud computing with the Windows Azure platform. He currently works mainly with partners and startups, and runs and delivers regular technical briefings around the Microsoft platform, including at TechEd Europe, TechDays and BizSpark Camp.
Developer Evangelist, Microsoft UK
Twitter: @scroffthebad
NEIL MACKENZIE
Neil Mackenzie has been programming since the late Bronze Age. He learned C++ when the only book available was written by Bjarne Stroustrup. He has been using SQL Server since v4.2.1 on Windows NT. However, he is only a recent convert to the joys of the .NET Framework and C#. He has been using Windows Azure since PDC 2008 and regrets the demise of the Live Framework API. Neil spent many years working in healthcare software and is currently involved in a stealth data-analytics startup. He lives in San Francisco, CA, having noticed that the weather there is somewhat better than it was in Scotland.
Blog: http://nmackenzie.spaces.live.com/blog/
Twitter: @mknz

MARK RENDLE
Mark is currently employed as a Senior Software Architect by Dot Net Solutions Ltd, creating all manner of software on the Microsoft stack, including ASP.NET MVC, Windows Azure, WPF and Silverlight. His career in software design and development spans three decades and more programming languages than he can remember. C# has been his favourite language pretty much since the first public beta, when you had to write the code in a text editor and compile it on the command line. Those were the days. You kids today, with your IntelliSense and your ReSharpers, don't know you're born... Things vying for Mark's attention lately include functional programming, internet-centric applications, the Azure cloud platform and NoSQL data stores.
Senior Software Architect, Dot Net Solutions Ltd
Blog: http://www.dotnetsolutions.co.uk/blogs/markrendle