Design Considerations For Storing With Windows Azure


Final deck for Software Architect 2009. Download is now enabled.

  • Name/value pairs (8 KB total)
  • PutBlob: 64 MB max. Metadata: 8 KB per blob
  • PutBlock: 4 MB max per block, up to a maximum of 50 GB per blob. BlockId: 64 bytes
  • Partition key: how data is partitioned. Row key: unique within the partition, defines the sort order. Goals: keep partitions small (increased scalability), specify the partition key in common queries, query/sort on the row key
  • Each table: PartitionKey (e.g. DocumentName) to ensure scalability; RowKey (e.g. version number); [fields] for data
  • 64 KB per field
  • Use XML serialization to write the results to local storage. It is generally faster to hydrate from local storage, though not as fast as caching in memory

    1. Design considerations for storing data in the cloud with Windows Azure
        • Eric Nelson, Microsoft UK
        • Slides, links and background “diary” posts can be found on my blog
    2. Agenda
        • Windows Azure Platform 101
        • Storage in the Cloud
        • Blobs
        • Tables
        • Relational
        • Queues
        • Lessons learned
    9. Windows Azure Platform 101: just in case you had something better to do over the last 18 months
    10. 3 Important Services and 3 Critical Concepts
        • Services: Windows Azure (compute and storage); SQL Azure (storage); .NET Services (connecting)
        • Concepts: computation (Web and Worker); storage (Table, Blob, Relational); messaging (Queues, Service Bus)
    11. A simple site: “Wow! What a great site!” [Diagram: Browser, Web Tier, B/L Tier, Database; request and response]
    12. Under load: “Server Busy” [Diagram: five Browsers, one Web Tier, one B/L Tier, one Database]
    13. Under load: “Timeout” [Diagram: five Browsers, one Web Tier, one B/L Tier, one Database]
    14. Solve using on-premise [Diagram: Browsers; NLB in front of three Web Tier instances; NLB in front of three B/L Tier instances; Database; p1 p2 p3]
    15. However… “Not so great now…” “That took a lot of work - and money!” “Hmmm... Most of this stuff is sitting idle...” [Diagram: one Browser; NLB, three Web Tier instances; NLB, three B/L Tier instances; Database; p1 p2 p3]
    16. Solve using the Cloud aka Windows Azure Platform [Diagram: Browsers; NLB, Web Roles; NLB, Worker Roles; Azure Storage; p1 p2 p3. Callouts: “You don’t see this bit” (three times), “or… Maybe you do”]
    17. Solve using the Cloud aka Windows Azure Platform [Diagram: Browsers; NLB, Web Roles; NLB, Worker Roles; Azure Storage; SQL Azure; p1 p2 p3. Callouts: “You don’t see this bit” (three times), “Ok, you definitely do”]
    18. Demo: Windows Azure Portal
    19. Storage in the Cloud… Windows Azure Storage and SQL Azure
    20. Blobs, Tables, Relational
    21. Blobs, Tables, Relational
    22. Blobs
        • Blobs are stored in Containers
        • 1 or more Containers per account
        • Scoping is at container level
        • …/Container/blobpath
        • Capacity: 50 GB in CTP
        • Metadata, accessed independently
        • Private or public container access
    23. Put a Blob [Diagram: Client, REST API, Azure Blob Storage, Blob Container; operation PutBlob, HTTP PUT]
    24. Get a Blob [Diagram: Client, REST API, Azure Blob Storage, Blob Container; operation GetBlob, HTTP GET]
    25. Get part of a Blob [Diagram: Client, REST API, Azure Blob Storage, Blob Container; operation GetBlob, HTTP GET with the header Range: bytes=329300-730000]
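A partial GetBlob is an ordinary HTTP GET with a Range header. The following is a minimal Python sketch of building that header for the byte range on the slide. The helper names `range_header` and `range_length` are made up for illustration and are not part of any Azure library.

```python
def range_header(first_byte: int, last_byte: int) -> dict:
    """Build an HTTP Range header for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("need 0 <= first_byte <= last_byte")
    # HTTP byte ranges are inclusive at both ends and written without spaces.
    return {"Range": f"bytes={first_byte}-{last_byte}"}


def range_length(first_byte: int, last_byte: int) -> int:
    """Number of bytes a server returns for an inclusive range."""
    return last_byte - first_byte + 1


headers = range_header(329300, 730000)
expected_bytes = range_length(329300, 730000)
```

The resulting dictionary can be passed as request headers to whichever HTTP client is in use.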
    26. Put a LARGE Blob
        • PutBlock(blobname, blockid1, data)
        • PutBlock(blobname, blockid7, data)
        • PutBlockList(blobname, blockid1, …, blockidN)
        [Diagram: Client, REST API, Azure Blob Storage, Blob Container]
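The large-blob flow above amounts to: cut the data into blocks, upload each block under its own ID, then commit the ordered list of IDs. Here is a rough Python sketch of the client-side bookkeeping only, with no network calls. The 4 MB limit comes from the slide notes; the ID scheme (`block-000000`, Base64 encoded) and the function names are assumptions chosen for illustration.

```python
import base64

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB per-block limit quoted in the slide notes


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (block_id, chunk) pairs, one pair per PutBlock call."""
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        # Fixed-width counter so every ID has the same length, then Base64.
        raw_id = f"block-{index:06d}".encode("ascii")
        block_id = base64.b64encode(raw_id).decode("ascii")
        blocks.append((block_id, data[offset:offset + block_size]))
    return blocks


def block_list(blocks):
    """Block IDs in upload order, as they would be handed to PutBlockList."""
    return [block_id for block_id, _ in blocks]


# Tiny demonstration with a 4-byte block size instead of 4 MB.
demo_blocks = split_into_blocks(b"0123456789", block_size=4)
demo_ids = block_list(demo_blocks)
```

Because each block is addressed by ID, blocks can be uploaded in parallel or retried individually, and the blob only changes once the list is committed.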
    27. Blobs, Tables, Relational
    28. Introduction to Tables
        • Provides structured storage
        • Massively scalable tables (TBs of data)
        • Self scaling
        • Highly available
        • Durable
        • Familiar and easy-to-use API, layered
        • .NET classes and LINQ
        • ADO.NET Data Services (.NET 3.5 SP1)
        • REST, with any platform or language
    29. Not a Relational Database
        • No join
        • No group by
        • No order by
        • “No Schema”
    30. Data Model
        • Table: a Table is a set of Entities (rows); an Entity is a set of Properties (columns)
        • Entity: two “key” properties form the unique ID
        • PartitionKey: enables scale
        • RowKey: uniquely identifies the entity within a partition
    31. Key Example: Blog Posts [Table of sample entities not recovered; it showed Partition 1 and Partition 2]
        • Getting all of dunnry’s blog posts is fast: single partition
        • Getting all posts after 2008-03-27 is slow: traverse all partitions
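The difference between the two queries can be seen in a small in-memory model. This Python sketch assumes the author is the PartitionKey and the post date is the RowKey, which the slide implies but the transcript does not state. The sample posts and the second author are invented.

```python
from collections import defaultdict

# partitions[PartitionKey][RowKey] = entity; a stand-in for table storage.
partitions = defaultdict(dict)


def insert_post(author, posted_on, title):
    partitions[author][posted_on] = {
        "PartitionKey": author, "RowKey": posted_on, "Title": title,
    }


def posts_by_author(author):
    """Filter on PartitionKey: touches exactly one partition."""
    rows = partitions.get(author, {})
    return [rows[row_key] for row_key in sorted(rows)]


def posts_after(date):
    """Filter on RowKey only: every partition has to be visited."""
    results, visited = [], 0
    for rows in partitions.values():
        visited += 1
        # ISO dates compare correctly as plain strings.
        results.extend(e for row_key, e in rows.items() if row_key > date)
    return results, visited


insert_post("dunnry", "2008-03-20", "First post")
insert_post("dunnry", "2008-04-02", "Second post")
insert_post("other_author", "2008-03-29", "Hello")

recent, partitions_visited = posts_after("2008-03-27")
```

With two authors the scan visits two partitions; with thousands of authors it visits thousands, which is why the date query is the slow one.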
    32. Query a Table [Diagram: Worker Role querying Azure Table Storage]
        • REST: GET …$filter=%20PartitionKey%20eq%20value (table URL not recovered)
        • LINQ: var customers = from o in context.CreateQuery<customer>("Customer") where o.PartitionKey == value select o;
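The REST form of the query is a GET against the table with an OData `$filter` expression in the query string. Below is a minimal Python sketch of assembling such a URL. The account name `myaccount`, the table name and the key value are placeholders, and the host pattern is the commonly documented one rather than anything taken from the slide.

```python
from urllib.parse import quote


def partition_filter_url(account: str, table: str, partition_key: str) -> str:
    """Build a table query URL that filters on PartitionKey."""
    # OData string literals use single quotes; an embedded quote is doubled.
    literal = partition_key.replace("'", "''")
    odata_filter = f"PartitionKey eq '{literal}'"
    encoded = quote(odata_filter, safe="")  # percent-encode spaces and quotes
    return f"https://{account}.table.core.windows.net/{table}()?$filter={encoded}"


# Placeholder account, table and key value.
url = partition_filter_url("myaccount", "Customer", "dunnry")
```

A real request also needs date, version and authorization headers, which this sketch leaves out.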
    33. Choosing a Partition Key
        • Tradeoff between locality and scalability
        • Considerations: entity group transactions, query efficiency, scalability, flexible partitioning
    34. A Method of Choosing Keys
        • Pick potential keys (common query filters)
        • Order keys by importance
        • If needed, include an additional unique key
        • Use the two most important keys as PK, RK
        • Consider concatenating to form keys
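Concatenated keys are plain strings, so they sort as strings. One way to keep numeric parts in numeric order is to zero-pad them, as in this Python sketch. The `make_key` helper, the `_` separator and the sample values are all assumptions; the DocumentName and version pairing follows the example in the slide notes.

```python
def make_key(*parts, width: int = 10) -> str:
    """Join key parts with '_'; integers are zero padded to a fixed width."""
    rendered = []
    for part in parts:
        if isinstance(part, int):
            # Padding keeps string order equal to numeric order (2 before 10).
            rendered.append(f"{part:0{width}d}")
        else:
            rendered.append(str(part))
    return "_".join(rendered)


# Hypothetical entity: tenant and document name form the PartitionKey,
# the version number forms the RowKey.
entity = {
    "PartitionKey": make_key("contoso", "Design.docx"),
    "RowKey": make_key(12),
    "Author": "someone",
}
```

Without padding, version "10" would sort before version "2", so range queries on the row key would return surprising results.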
    35. Write many is one approach
        • Non-key queries are scans
        • Improve performance by scoping
        • Usually by partition key
        • But what about by table?
        • 3 tables: top 1,000 popular items; top 10,000 popular items; everything
        • Now arbitrary “top 1,000” queries are fast
        • Better locality than clever partition keys
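The three-table idea can be sketched as a single write routine that copies an item into every table whose cutoff covers its popularity rank. This Python sketch uses dictionaries in place of real tables; the table names and the `write_item` helper are invented for illustration.

```python
# Table name -> highest rank it accepts (None means no limit).
CUTOFFS = {"Top1000": 1_000, "Top10000": 10_000, "Everything": None}
tables = {name: {} for name in CUTOFFS}


def write_item(item_id: str, rank: int, payload: dict):
    """Write the same entity to every table whose cutoff includes its rank."""
    written = []
    for name, cutoff in CUTOFFS.items():
        if cutoff is None or rank <= cutoff:
            tables[name][item_id] = {"Rank": rank, **payload}
            written.append(name)
    return written


hot = write_item("item-a", 5, {"Title": "Very popular"})
warm = write_item("item-b", 4_200, {"Title": "Fairly popular"})
cold = write_item("item-c", 250_000, {"Title": "Long tail"})
```

The sketch only covers inserts. An item whose rank drops would also have to be removed from the smaller tables, which is the extra cost of writing many.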
    36. Demo: Windows Azure Storage
    37. Lessons Learned: Azure Storage
        • Azure tables are *not* a relational database; this requires a mind shift
        • Azure tables scale
        • Three 9s availability
        • Azure tables support exactly one key: PartitionKey + RowKey
        • Case matters
        • No foreign keys
        • No referential integrity
        • No stored procedures
    38. Lessons Learned: Azure Storage
        • Azure Storage Client Library: no longer just a “sample”
        • Azure storage is available via REST
        • Not limited to Azure hosted apps
        • Not limited to Microsoft platform or tools
        • Getting the signature correct is the hard part
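The cryptographic half of the signature is small: an HMAC-SHA256 over a canonical string, keyed with the Base64-decoded account key, then Base64 encoded. The Python sketch below shows only that step. Building the canonical string-to-sign is the part that usually goes wrong; it is passed in here as an opaque argument, and the account name, key and sample string are placeholders.

```python
import base64
import hashlib
import hmac


def sign(string_to_sign: str, account_key_b64: str) -> str:
    """Base64(HMAC-SHA256(string_to_sign)) keyed with the decoded account key."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(account: str, string_to_sign: str, account_key_b64: str) -> str:
    """Value for the Authorization header in the Shared Key scheme."""
    return f"SharedKey {account}:{sign(string_to_sign, account_key_b64)}"


# Placeholder key and canonical string, for demonstration only.
fake_key = base64.b64encode(b"not-a-real-account-key").decode("ascii")
header = authorization_header("myaccount", "GET\n\n\n...\n/myaccount/Customer()", fake_key)
```

Any difference in the canonical string, down to header order or a trailing newline, yields a different signature and a rejected request.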
    39. Lessons Learned: Azure Storage is RESTful
        • REST is *not* TDS
        • Be prepared to parse
        • LINQ and XML classes help
        • Sometimes, string parsing is the best choice
        • Azure storage names are picky
        • So are Azure key values
        • It’s possible to create an entity in a table and not be able to update or delete it
    40. Lessons Learned: Azure Storage roundtrips are expensive
        • Often better to pull back more than you need vs. multiple roundtrips
        • LINQ on results in memory is fast and flexible
        • foreach works well too
        • Sort and cache tables on the web tier
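The same pattern, shown in Python rather than LINQ: fetch a whole partition in one call, then filter and sort the result in memory. The `fetch_partition` function below stands in for a real storage call and simply counts how often it is invoked; the order data is invented.

```python
FAKE_TABLE = [
    {"PartitionKey": "customer-1", "RowKey": "2009-08-03", "Total": 250},
    {"PartitionKey": "customer-1", "RowKey": "2009-07-14", "Total": 40},
    {"PartitionKey": "customer-1", "RowKey": "2009-09-21", "Total": 120},
    {"PartitionKey": "customer-2", "RowKey": "2009-07-30", "Total": 999},
]
roundtrips = 0


def fetch_partition(partition_key):
    """Stand-in for one storage roundtrip that returns a whole partition."""
    global roundtrips
    roundtrips += 1
    return [e for e in FAKE_TABLE if e["PartitionKey"] == partition_key]


# One roundtrip, then every further question is answered from memory.
orders = fetch_partition("customer-1")
large_orders = [o for o in orders if o["Total"] > 100]
by_date = sorted(orders, key=lambda o: o["RowKey"])
grand_total = sum(o["Total"] for o in orders)
```

Three questions are answered with one roundtrip instead of three, at the cost of pulling back rows that some of the questions do not need.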
    41. Lessons Learned: Azure Storage Entity Group Transactions
        • Different Entity types in the same table
        • E.g. PK = CustomerId
        • Customer, Order and OrderDetails in the same table
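Entity group transactions apply to entities that share a PartitionKey, which is why the slide puts three entity types in one table under PK = CustomerId. A Python sketch of what such entities might look like follows. The RowKey naming scheme and the `Kind` property are assumptions made for illustration, not something the slide specifies.

```python
def customer_entity(customer_id, name):
    return {"PartitionKey": customer_id, "RowKey": "Customer",
            "Kind": "Customer", "Name": name}


def order_entity(customer_id, order_id, total):
    return {"PartitionKey": customer_id, "RowKey": f"Order_{order_id}",
            "Kind": "Order", "Total": total}


def order_detail_entity(customer_id, order_id, line, sku):
    return {"PartitionKey": customer_id,
            "RowKey": f"Order_{order_id}_Detail_{line:03d}",
            "Kind": "OrderDetail", "Sku": sku}


def can_batch(entities) -> bool:
    """True when all entities share one PartitionKey, the precondition for a group transaction."""
    return len({e["PartitionKey"] for e in entities}) == 1


group = [
    customer_entity("cust-42", "Example Ltd"),
    order_entity("cust-42", 1001, 99.5),
    order_detail_entity("cust-42", 1001, 1, "SKU-A"),
    order_detail_entity("cust-42", 1001, 2, "SKU-B"),
]
row_keys_in_order = sorted(e["RowKey"] for e in group)
```

With this naming, a customer's row sorts first and each order is followed directly by its detail lines, so one partition scan returns them already grouped.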
    42. Blobs, Tables, Relational
    43. SQL Azure (July 2009), aka SQL Data Services, aka SQL Server Data Services
    44. On Premise Programming Model: this is what we do on-premise... [Diagram: Client, TDS, SQL Server (RDBMS), Data]
    45. Same for the cloud? So, is this what we would like to do in the cloud... [Diagram: Client, TDS, SQL Server (RDBMS), Data]
    46. SQL Azure can do this [Diagram: Client, TDS, SQL Azure (RDBMS), Data]
    47. SQL Azure can also do this [Diagram: Browser, HTTP, Web Role, TDS, SQL Azure (RDBMS)]
    48. And this! [Diagram: Browser, HTTP, Web Role, Queue, Worker Role, TDS, SQL Azure (RDBMS)]
    49. Which means you can easily migrate from this: “The Data Center” [Diagram: Browser, HTTP, Web Tier, Bus. Logic, TDS, SQL Server (RDBMS)]
    50. To this… Windows Azure and SQL Azure: “The Cloud” [Diagram: Browser, HTTP, Web Role, Queue, Worker Role, TDS, SQL Azure (RDBMS)]
    51. Demo: SQL Azure
    52. Lessons Learned: SQL Azure
        • From the database “down” it’s just SQL Server
        • Well, almost…
        • Many tools don’t work today
        • System catalog is different
        • Above the database is taken care of for you
        • You can’t really change anything
    53. Lessons Learned: SQL Azure
        • Tooling: SSMS partially works (“good enough”); cannot create a connection using the Visual Studio designer; other tools may work better; no BCP (currently)
        • DDL: must be a clustered index on every table; no physical file placement; no indexed views; no “not for replication” constraint allowed; no extended properties; some index options missing (e.g. allow_row_locks, sort_in_tempdb); no set ansi_nulls on
    54. Lessons Learned: SQL Azure
        • Types: no spatial or hierarchyid; no text/image support, use nvarchar(max); XML datatype and schema allowed but no XML index or schema collection
        • Security: no integrated security
    55. Lessons Learned: SQL Azure
        • Development: no CLR; local temp tables are allowed; global temp tables are not allowed; cannot alter database inside a connection; no UDDTs; no ROWGUIDCOL column property
    56. Lessons Learned: SQL Azure vs Windows Azure Tables
        • SQL Server is very familiar
        • SQL Azure *is* SQL Server in the cloud
        • Windows Azure Storage is… very different
        • Make the right choice: understand Azure storage; understand SQL Azure; understand they are totally different
        • You can use both
    57. Lessons Learned: SQL Azure vs Windows Azure Tables
        • SQL Azure is not always the best storage option
        • SQL Azure costs more
        • Delivers a *lot* more functionality
        • SQL Azure is more limited on scale
    58. Lessons Learned: SQL Azure and Sharding
        • Can be done
        • Many 10 GB databases
        • Not fun
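Sharding across many small databases means the application decides which database holds each row. One simple routing scheme, sketched here in Python, hashes a shard key and takes the result modulo the number of databases. The database names and the choice of customer ID as shard key are hypothetical, and this is one of several possible schemes, not the one the talk prescribes.

```python
import hashlib

# Hypothetical database names, each standing for one 10 GB SQL Azure database.
SHARDS = ["orders_db_00", "orders_db_01", "orders_db_02", "orders_db_03"]


def shard_for(customer_id: str, shards=SHARDS) -> str:
    """Pick a database deterministically from a stable hash of the shard key."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then wrap to the shard count.
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


placement = {cid: shard_for(cid) for cid in ("cust-1", "cust-2", "cust-3")}
```

Queries that span customers must fan out to every database, and adding a database changes the modulo and moves most keys, which is a large part of why this is "not fun".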
    59. Queues
    60. Queues
        • Simple asynchronous dispatch queue
        • Create and delete queues
        • Message: retrieved at least once; max size 8 KB
        • Operations: Enqueue, Dequeue, RemoveMessage
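"Retrieved at least once" means a consumer can see the same message twice, so handlers should be idempotent. The Python sketch below models this with an in-memory queue: it enforces the 8 KB limit on enqueue and skips messages whose ID has already been processed. The queue, the message IDs and the handler are stand-ins, not the Azure queue API.

```python
from collections import deque

MAX_MESSAGE_BYTES = 8 * 1024  # 8 KB limit from the slide

queue = deque()
processed_ids = set()
results = []


def enqueue(message_id: str, body: str):
    if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds the 8 KB limit")
    queue.append((message_id, body))


def handle(message_id: str, body: str) -> bool:
    """Idempotent handler: a redelivered message is recognised and skipped."""
    if message_id in processed_ids:
        return False
    results.append(body.upper())
    processed_ids.add(message_id)
    return True


enqueue("m1", "resize image 17")
enqueue("m2", "send welcome mail")
enqueue("m1", "resize image 17")  # simulated redelivery of m1

handled = [handle(*queue.popleft()) for _ in range(len(queue))]
```

In a real worker the set of processed IDs would have to live somewhere durable, since the worker itself can restart between deliveries.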
    61. Using the Cloud for Communications [Diagram: Client, REST, Azure Queue]
    62. Using the Cloud for Communications [Diagram: Company 1 Client and Company 2 Client, REST, Azure Queue]
    63. Using the Cloud for Communications [Diagram: Company 1 Client and Company 2 Client, REST, Azure Queue; one link marked “x”]
    64. Using the Cloud for Communications [Diagram: Company 1 Client and Company 2 Client, REST, Azure Queue, Web Role]
    65. In Summary
    66. Windows Azure Platform Benefits
        • Windows Azure
            • High level of abstraction: hardware, server OS, network infrastructure, web server
            • Availability: automated service management
            • Scalability: instances and partitions
            • Developer experience: familiar developer tools
        • SQL Azure
            • Higher level of abstraction: hardware, server OS, network infrastructure, database server
            • Availability: automated database management and replication
            • Scalability: database partitioning
            • Developer experience: familiar SQL environment
    67. Resources
        • Slides, links and more
        • Azure Training Kit (August update)
        • Sign up, links to resources etc
        • Rapid provisioning of Windows Azure