Design considerations for storing data in the cloud with Windows Azure
Eric Nelson, Microsoft UK
Blog: http://geekswithblogs.net/iupdateable
Twitter: http://twitter.com/ericnel and http://twitter.com/ukmsdn
Podcast: http://bit.ly/msdnpodcast
Newsletter: http://msdn.microsoft.com/en-gb/flash
Slides, links and background “diary” posts can be found on my blog
Agenda
  • Windows Azure Platform 101
  • Storage in the Cloud
  • Blobs
  • Tables
  • Relational
  • Queues
  • Lessons learned
Windows Azure Platform 101
  • Just in case you had something better to do over the last 18 months
3 Important Services
  • Windows Azure: Compute and Storage
  • SQL Azure: Storage
  • .NET Services: Connecting
3 Critical Concepts
  • Computation: Web and Worker
  • Storage: Table, Blob, Relational
  • Messaging: Queues, Service Bus
A simple site
  • Diagram: Browser → Web Tier → B/L Tier → Database (Request out, Response back)
  • “Wow! What a great site!”
Under load
  • Diagram: many Browsers → Web Tier → B/L Tier → Database
  • “Server Busy”
Under load
  • Diagram: many Browsers → Web Tier → B/L Tier → Database
  • “Timeout”
Solve using on-premise
  • Diagram: Browsers → NLB → three Web Tier servers → NLB → three B/L Tier servers → Database partitions p1, p2, p3
However…
  • Diagram: the same scaled-out deployment, now serving a single Browser
  • “Not so great now…”
  • “That took a lot of work - and money!”
  • “Hmmm... Most of this stuff is sitting idle...”
Solve using the Cloud aka Windows Azure Platform
  • Diagram: Browsers → NLB → Web Roles → NLB → Worker Roles → Azure Storage (p1, p2, p3)
  • Callouts on the infrastructure: “You don’t see this bit” (three times), or… “Maybe you do”
Solve using the Cloud aka Windows Azure Platform
  • Diagram: as on the previous slide, with SQL Azure added
  • Callouts on the infrastructure: “You don’t see this bit” (three times), “Ok, you definitely do”
Demo:  Windows Azure Portal
Storage in the Cloud…
  • Windows Azure Storage and SQL Azure
Blobs, Tables, Relational
Blobs
  • Blobs stored in Containers: 1 or more Containers per account; scoping is at container level; …/Container/blobpath
  • Blobs: capacity 50GB in CTP; metadata, accessed independently
  • Private or Public container access
Put a Blob
  • PutBlob: PUT http://account.blob.core.windows.net/containername/blobname
  • Diagram: Client → REST API → Azure Blob Storage → Blob Container
  • Blob address: http://account.blob.core.windows.net/containername/blobname
Get a Blob
  • GetBlob: GET http://account.blob.core.windows.net/containername/blobname
  • Diagram: Client → REST API → Azure Blob Storage → Blob Container
  • Blob address: http://account.blob.core.windows.net/containername/blobname
Get part of a Blob
  • GetBlob: GET http://account.blob.core.windows.net/containername/blobname
  • Range: bytes=329300-730000
  • Diagram: Client → REST API → Azure Blob Storage → Blob Container
  • Blob address: http://account.blob.core.windows.net/containername/blobname
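
The partial read above comes down to one URL and one HTTP header. The sketch below only builds those two values and makes no network call; the function names are hypothetical, and the account, container and blob names are the placeholders from the slides.

```python
def blob_url(account: str, container: str, blob: str) -> str:
    """Build a blob address in the form shown on the slides."""
    return f"http://{account}.blob.core.windows.net/{container}/{blob}"


def range_header(first_byte: int, last_byte: int) -> dict:
    """Return an HTTP Range header for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("expected 0 <= first_byte <= last_byte")
    return {"Range": f"bytes={first_byte}-{last_byte}"}


url = blob_url("account", "containername", "blobname")
headers = range_header(329300, 730000)
print("GET", url, headers)
```

Any HTTP client can then send the GET with these headers; authentication is left out of this sketch.
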
Put a LARGE Blob
  • PutBlock(blobname, blockid1, data)
  • …
  • PutBlock(blobname, blockid7, data)
  • PutBlockList(blobname, blockid1, …, blockidN)
  • Diagram: Client → REST API → Azure Blob Storage → Blob Container
  • Blob address: http://account.blob.core.windows.net/containername/blobname
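
The block sequence above can be sketched as a planning step: split the data, give each block an id, then commit the ordered id list. This is an illustration only and sends nothing. The 4MB block size comes from the editor's notes; the id scheme (fixed-width text, base64-encoded) and the function names are assumptions of this sketch.

```python
import base64

BLOCK_SIZE = 4 * 1024 * 1024  # per-block limit quoted in the editor's notes


def plan_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into (block_id, chunk) pairs in upload order."""
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        # Hypothetical id scheme: fixed-width names so every id has the same length.
        raw_id = f"block-{index:06d}".encode("ascii")
        block_id = base64.b64encode(raw_id).decode("ascii")
        blocks.append((block_id, data[offset:offset + block_size]))
    return blocks


def upload_plan(blobname: str, data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """List the calls a client would make: one PutBlock per chunk, then PutBlockList."""
    blocks = plan_blocks(data, block_size)
    calls = [("PutBlock", blobname, block_id, len(chunk)) for block_id, chunk in blocks]
    calls.append(("PutBlockList", blobname, [block_id for block_id, _ in blocks]))
    return calls


# Tiny block size so the example is easy to follow: 10 bytes in blocks of 4.
calls = upload_plan("blobname", b"0123456789", block_size=4)
for call in calls:
    print(call)
```

Because the blob only appears once the block list is committed, a client can retry or parallelise the individual PutBlock calls before the final PutBlockList.
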
Blobs, Tables, Relational
Introduction to Tables
  • Provides structured storage: massively scalable tables (TBs of data); self scaling; highly available; durable
  • Familiar and easy-to-use API, layered: .NET classes and LINQ; ADO.NET Data Services – .NET 3.5 SP1; REST – with any platform or language
Not a Relational Database
  • No join
  • No group by
  • No order by
  • “No Schema”
Data Model
  • Table: a Table is a set of Entities (rows); an Entity is a set of Properties (columns)
  • Entity: two “key” properties form the unique ID
  • PartitionKey – enables scale
  • RowKey – unique ID within a partition
Key Example – Blog Posts
  • Diagram: example entities split across Partition 1 and Partition 2
  • Getting all of dunnry’s blog posts is fast: single partition
  • Getting all posts after 2008-03-27 is slow: traverse all partitions
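
The difference between the two queries above can be shown with an in-memory stand-in for a table. This is a sketch under assumptions: the author name is used as the PartitionKey and the post date as the RowKey, and the entities, the second author and the function names are made up for illustration.

```python
# Hypothetical stand-in for a table: PartitionKey -> RowKey -> entity.
table = {
    "dunnry": {
        "2008-03-20": {"title": "First post"},
        "2008-04-02": {"title": "Second post"},
    },
    "otherauthor": {
        "2008-03-29": {"title": "Another post"},
    },
}


def posts_by_author(table: dict, author: str) -> tuple:
    """Keyed lookup: only the author's partition is read."""
    partition = table.get(author, {})
    return sorted(partition), 1  # (row keys, partitions touched)


def posts_after(table: dict, date: str) -> tuple:
    """Query without a PartitionKey: every partition has to be visited."""
    matches = []
    for author, partition in table.items():
        matches.extend((author, row_key) for row_key in partition if row_key > date)
    return sorted(matches), len(table)


print(posts_by_author(table, "dunnry"))
print(posts_after(table, "2008-03-27"))
```

The second value in each result counts the partitions touched: one for the keyed lookup, all of them for the date query.
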
Query a Table
  • REST: GET http://account.table.core.windows.net/Customer?$filter=%20PartitionKey%20eq%20value
  • LINQ: var customers = from o in context.CreateQuery<customer>("Customer") where o.PartitionKey == value select o;
  • Diagram: Worker Role → http://account.table.core.windows.net → Azure Table Storage
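
The REST form of the query is just a URL with an encoded `$filter` expression. The sketch below builds that URL without sending it. It assumes the PartitionKey is a string, which OData writes as a single-quoted literal; the function name is hypothetical.

```python
from urllib.parse import quote


def table_query_url(account: str, table: str, partition_key: str) -> str:
    """Build a table query URL that filters on PartitionKey."""
    # OData string literals are single-quoted; an embedded quote is doubled.
    literal = "'" + partition_key.replace("'", "''") + "'"
    filter_expr = f"PartitionKey eq {literal}"
    return f"http://{account}.table.core.windows.net/{table}?$filter={quote(filter_expr)}"


url = table_query_url("account", "Customer", "value")
print(url)
```
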
Choosing a Partition Key
  • Tradeoff between locality and scalability
  • Considerations: entity group transactions; query efficiency; scalability; flexible partitioning
A Method of Choosing Keys
  • Pick potential keys (common query filters)
  • Order keys by importance
  • If needed, include an additional unique key
  • Use two most important keys as PK, RK
  • Consider concatenating to form keys
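
As a rough illustration of the method above, suppose the two most common filters are customer and order date, with an order id added for uniqueness. The field names, sample values and the `make_keys` helper are hypothetical; the point is only that concatenating sortable values gives a RowKey that both identifies and orders entities within a partition.

```python
def make_keys(customer_id: str, order_date: str, order_id: str) -> tuple:
    """Return (PartitionKey, RowKey) for an order entity."""
    partition_key = customer_id              # most important filter
    row_key = f"{order_date}_{order_id}"     # second filter plus a unique suffix
    return partition_key, row_key


keys = [
    make_keys("C042", "2009-07-15", "0007"),
    make_keys("C042", "2009-03-01", "0002"),
    make_keys("C042", "2009-07-15", "0003"),
]
# ISO dates and zero-padded ids sort correctly as plain strings.
ordered = sorted(row_key for _, row_key in keys)
print(ordered)
```
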
Write many is one approach
  • Non-key queries are scans
  • Improve performance by scoping: usually by partition key, but what about by table?
  • 3 tables: Top 1,000 popular items; Top 10,000 popular items; Everything
  • Now arbitrary “top 1,000” queries are fast
  • Better locality than clever partition keys
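
A minimal sketch of the write-many idea: each item is written to every table whose scope it falls into, so a “top 1,000” query only ever scans the small table. The table names, the popularity rank and the in-memory dictionaries are stand-ins invented for this example.

```python
# (table name, highest rank it holds); None means no limit.
TIERS = [("Top1000", 1000), ("Top10000", 10000), ("Everything", None)]


def write_many(tables: dict, item_id: str, item: dict, rank: int) -> list:
    """Write the item to each table it qualifies for and return the table names."""
    written = []
    for name, limit in TIERS:
        if limit is None or rank <= limit:
            tables.setdefault(name, {})[item_id] = item
            written.append(name)
    return written


tables = {}
print(write_many(tables, "a", {"name": "A"}, rank=12))
print(write_many(tables, "b", {"name": "B"}, rank=5000))
print(write_many(tables, "c", {"name": "C"}, rank=250000))
```

The cost is extra writes and keeping the copies in step when an item's rank changes, which this sketch does not handle.
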
Demo:  Windows Azure Storage
Lessons Learned: Azure Storage
  • Azure tables are *not* a relational database: requires a mind shift
  • Azure tables scale: three-nines (99.9%) availability
  • Azure tables support exactly one key: PartitionKey + RowKey; case matters
  • No foreign keys
  • No referential integrity
  • No stored procedures
Lessons Learned: Azure Storage
  • Azure Storage Client Library: no longer just a “sample”
  • Azure storage is available via REST: not limited to Azure hosted apps; not limited to Microsoft platform or tools; getting the signature correct is the hard part
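
To give a feel for why the signature is the hard part, here is a sketch of the final signing step only. It assumes the Shared Key scheme: the account key is base64 text, the request is reduced to a canonical string, and the signature is a base64-encoded HMAC-SHA256 of that string. The string-to-sign below is a placeholder; its exact layout is defined by the storage service documentation and is not reproduced here. The key is fake and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def sign(string_to_sign: str, account_key_b64: str) -> str:
    """Return the base64 HMAC-SHA256 of the canonical request string."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(account: str, signature: str) -> str:
    """Format the Authorization header value for the signed request."""
    return f"SharedKey {account}:{signature}"


fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
# Placeholder only: the real canonical string has a strict, documented layout.
string_to_sign = "GET\n\n\nMon, 27 Jul 2009 12:00:00 GMT\n/account/containername/blobname"
signature = sign(string_to_sign, fake_key)
print(authorization_header("account", signature))
```

The hashing is the easy half. A single misplaced newline or header in the canonical string yields a different signature and a rejected request, which matches the lesson above.
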
Lessons Learned: Azure Storage - RESTful
  • REST is *not* TDS: be prepared to parse; LINQ and XML classes help; sometimes, string parsing is the best choice
  • Azure storage names are picky: so are Azure key values; it’s possible to create an entity in a table and not be able to update or delete it
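
One way to avoid entities that cannot be addressed later is to check key values before writing. The sketch below assumes the restricted characters for PartitionKey and RowKey are forward slash, backslash, `#`, `?` and control characters; treat that list as an assumption to verify against the current table service documentation. The function name is hypothetical.

```python
FORBIDDEN = set("/\\#?")


def key_problems(value: str) -> list:
    """Return a list of reasons why value is risky as a PartitionKey or RowKey."""
    problems = []
    for ch in value:
        code = ord(ch)
        if ch in FORBIDDEN:
            problems.append(f"forbidden character {ch!r}")
        elif code < 0x20 or 0x7F <= code <= 0x9F:
            problems.append(f"control character U+{code:04X}")
    return problems


print(key_problems("orders-2009"))
print(key_problems("orders/2009"))
```
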
Lessons Learned: Azure Storage – Roundtrips are expensive
  • Often better to pull back more than you need vs. multiple roundtrips
  • LINQ on results in memory is fast & flexible
  • foreach works well too
  • Sort and cache tables on the web tier
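
The same idea in a small sketch: fetch one partition once, then filter and sort the results in memory instead of issuing a query per question. `FakeTable` is a hypothetical stand-in that only counts round trips; the rows are invented sample data.

```python
class FakeTable:
    """In-memory stand-in for a remote table that counts round trips."""

    def __init__(self, rows: list):
        self._rows = rows
        self.round_trips = 0

    def query_partition(self, partition_key: str) -> list:
        self.round_trips += 1  # each call models one network round trip
        return [dict(row) for row in self._rows if row["PartitionKey"] == partition_key]


rows = [
    {"PartitionKey": "C042", "RowKey": "0001", "Total": 40},
    {"PartitionKey": "C042", "RowKey": "0002", "Total": 250},
    {"PartitionKey": "C042", "RowKey": "0003", "Total": 90},
    {"PartitionKey": "C077", "RowKey": "0001", "Total": 15},
]
table = FakeTable(rows)

orders = table.query_partition("C042")                  # one round trip
large = [o for o in orders if o["Total"] >= 90]         # filtered in memory
by_total = sorted(orders, key=lambda o: o["Total"], reverse=True)  # sorted in memory

print(table.round_trips, [o["RowKey"] for o in large], by_total[0]["RowKey"])
```
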
Lessons Learned: Azure Storage – Entity Group Transactions
  • Different Entity types in the same table
  • E.g. PK = CustomerId
  • Customer, Order and OrderDetails in the same table
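
A sketch of what sharing a partition buys: entities of different types that carry the same PartitionKey can be grouped into one batch, while a batch that spans partitions is rejected. The RowKey naming scheme, the property names and `build_batch` are hypothetical, and nothing is sent to a service.

```python
def build_batch(entities: list) -> list:
    """Group entities for one transaction; they must share a PartitionKey."""
    partition_keys = {entity["PartitionKey"] for entity in entities}
    if len(partition_keys) != 1:
        raise ValueError("an entity group transaction must target a single partition")
    return list(entities)


customer_id = "C042"
entities = [
    {"PartitionKey": customer_id, "RowKey": "Customer", "Name": "Contoso"},
    {"PartitionKey": customer_id, "RowKey": "Order_0007", "Total": 250},
    {"PartitionKey": customer_id, "RowKey": "OrderDetail_0007_01", "Sku": "ABC"},
]
batch = build_batch(entities)
print(len(batch), "entities in one batch for partition", customer_id)
```

Prefixing the RowKey with the entity type is one way to keep the three kinds of entity distinguishable inside a schemaless table.
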
Blobs, Tables, Relational
SQL Azure (July 2009)
  • aka SQL Data Services
  • aka SQL Server Data Services
On Premise Programming Model
  • This is what we do on-premise...
  • Diagram: Client → TDS → SQL Server (RDBMS, Data)
Same for the cloud?
  • So, is this what we would like to do in the cloud...
  • Diagram: Client → TDS → SQL Server (RDBMS, Data)
SQL Azure can do this
  • Diagram: Client → TDS → SQL Azure (RDBMS, Data)
SQL Azure can also do this
  • Diagram: Browser → HTTP → Web Role → TDS → SQL Azure (RDBMS)
And this!
  • Diagram: Browser → HTTP → Web Role → Queue → Worker Role → TDS → SQL Azure (RDBMS)
Which means you can easily migrate from this
  • “The Data Center”
  • Diagram: Browser → HTTP → Web Tier → Bus. Logic → TDS → SQL Server (RDBMS)
To this… Windows Azure and SQL Azure
  • “The Cloud”
  • Diagram: Browser → HTTP → Web Role → Queue → Worker Role → TDS → SQL Azure (RDBMS)
Demo:  SQL Azure
Lessons Learned: SQL Azure
  • From the database “down” it’s just SQL Server: well, almost …; many tools don’t work today; system catalog is different
  • Above the database is taken care of for you: you can’t really change anything
Lessons Learned: SQL Azure
  • Tooling: SSMS partially works – “good enough”; cannot create a connection using the Visual Studio designer; other tools may work better; no BCP (currently)
  • DDL: there must be a clustered index on every table; no physical file placement; no indexed views; no “not for replication” constraint allowed; no extended properties; some index options missing (e.g. allow_row_locks, sort_in_tempdb ..); no set ansi_nulls on
Lessons Learned: SQL Azure
  • Types: no spatial or hierarchy id; no text/image support, use nvarchar(max); XML datatype and schema allowed but no XML index or schema collection
  • Security: no integrated security
Lessons Learned: SQL Azure
  • Development: no CLR; local temp tables are allowed; global temp tables are not allowed; cannot alter database inside a connection; no UDDTs; no ROWGUIDCOL column property
Lessons Learned: SQL Azure vs Windows Azure Tables
  • SQL Server is very familiar: SQL Azure *is* SQL Server in the cloud
  • Windows Azure Storage is… very different
  • Make the right choice: understand Azure storage; understand SQL Azure; understand they are totally different
  • You can use both
Lessons Learned: SQL Azure vs Windows Azure Tables
  • SQL Azure is not always the best storage option
  • SQL Azure costs more: delivers a *lot* more functionality
  • SQL Azure is more limited on scale
Lessons Learned: SQL Azure and Sharding
  • Can be done
  • Many 10GB databases
  • Not fun
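
A minimal sketch of what sharding across many small databases involves at the application level: every request has to be routed to the database that owns its key. Hash-based routing is one possible scheme, chosen here for illustration; the database names and `shard_for` are hypothetical.

```python
import hashlib

# Hypothetical set of databases, each kept under the size limit.
SHARDS = ["customers_00", "customers_01", "customers_02", "customers_03"]


def shard_for(key: str, shards: list) -> str:
    """Pick a database for the key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


for customer_id in ("C042", "C077", "C103"):
    print(customer_id, "->", shard_for(customer_id, SHARDS))
```

Part of the “not fun” is what this leaves out: queries that span shards have to be merged by the application, and changing the number of databases remaps existing keys.
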
Queues
Queues
  • Simple asynchronous dispatch queue
  • Create and delete queues
  • Message: retrieved at least once; max size 8KB
  • Operations: Enqueue; Dequeue; RemoveMessage
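
“Retrieved at least once” has a practical consequence: a message that is dequeued but never removed will be handed out again, so workers should tolerate repeats. The sketch below models that with an in-memory queue. The visibility timeout, the explicit `now` clock and the class name are assumptions made for this illustration; only the three operations and the 8KB limit come from the slide.

```python
MAX_MESSAGE_BYTES = 8 * 1024


class SketchQueue:
    """In-memory model of Enqueue, Dequeue and RemoveMessage."""

    def __init__(self):
        self._messages = {}
        self._next_id = 0

    def enqueue(self, body: str) -> None:
        if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message exceeds 8KB")
        self._messages[self._next_id] = {"body": body, "visible_at": 0}
        self._next_id += 1

    def dequeue(self, now: int, visibility_timeout: int = 30):
        """Return (message_id, body) and hide the message for a while, or None."""
        for message_id, message in self._messages.items():
            if message["visible_at"] <= now:
                message["visible_at"] = now + visibility_timeout
                return message_id, message["body"]
        return None

    def remove_message(self, message_id: int) -> None:
        self._messages.pop(message_id, None)


queue = SketchQueue()
queue.enqueue("resize image 42")

first = queue.dequeue(now=0)      # worker takes the message, then crashes
hidden = queue.dequeue(now=10)    # still hidden from other workers
second = queue.dequeue(now=31)    # timeout passed: the same message comes back
queue.remove_message(second[0])   # processed successfully this time
empty = queue.dequeue(now=100)

print(first, hidden, second, empty)
```
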
Using the Cloud for Communications
  • Diagram: Client → REST → Azure Queue
  • Queue address: http://app.queue.core.windows.net/


Editor's Notes

  • #17 Name/value pairs (8KB total)
  • #18 PutBlob = 64MB max; metadata = 8KB per blob
  • #21 PutBlock = 4MB max, to a maximum of 50GB; BlockId = 64 bytes
  • #26 Partition Key – how data is partitioned. Row Key – unique in partition, defines sort. Goals: keep partitions small (increased scalability); specify partition key in common queries; query/sort on row key
  • #27 Each Table: PartitionKey (e.g. DocumentName) to ensure scalability; RowKey (e.g. version number); [fields] for data
  • #32 64KB per field
  • #35 Use XML Serialization to write the results to local storage. It’s generally faster to hydrate from local storage. Not as fast as caching in memory