Sql saturday azure storage by Anton Vidishchev

Published in: Technology, Business

  • 1. Windows Azure Storage Overview, Internals and Best Practices
  • 2. Sponsors
  • 3. About me
      • Program Manager @ Edgar Online, RRD
      • Windows Azure MVP
      • Co-organizer of Odessa .NET User Group
      • Ukrainian IT Awards 2013 Winner – Software Engineering
  • 4. What is Windows Azure Storage?
  • 5. Windows Azure Storage
      • Cloud storage: anywhere and anytime access
      • Blobs, Disks, Tables and Queues
      • Highly durable, available and massively scalable
      • Easily build “internet scale” applications
      • 10 trillion stored objects
      • 900K requests/sec on average (2.3+ trillion per month)
      • Pay for what you use
      • Exposed via easy and open REST APIs
      • Client libraries in .NET, Java, Node.js, Python, PHP, Ruby
  • 6. Abstractions – Blobs and Disks
  • 7. Abstractions – Tables and Queues
  • 8. Data centers
  • 9. Windows Azure Data Storage Concepts
      • Account → Container → Blobs: https://<account><container>
      • Account → Table → Entities: https://<account><table>
      • Account → Queue → Messages: https://<account><queue>
  • 10. How is Azure Storage used by Microsoft?
  • 11. Internals
  • 12. Design Goals
      • Highly available with strong consistency
          • Provide access to data in the face of failures/partitioning
      • Durability
          • Replicate data several times within and across regions
      • Scalability
          • Need to scale to zettabytes
          • Provide a global namespace to access data around the world
          • Automatically scale out and load balance data to meet peak traffic demands
  • 13. Windows Azure Storage Stamps
      • Access blob storage via the URL: http://<account>
      • [Diagram: data access goes through the Location Service to two Storage Stamps; each stamp has an LB, Front-Ends, a Partition Layer and a DFS Layer; inter-stamp (geo) replication runs between stamps, intra-stamp replication within each stamp]
  • 14. Architecture Layers inside Stamps Partition Layer Index
  • 15. Availability with Consistency for Writing
      • All writes are appends to the end of a log, which is an append to the last extent in the log
      • Write consistency across all replicas for an extent:
          • Appends are ordered the same across all 3 replicas for an extent (file)
          • Only return success if all 3 replica appends are committed to storage
          • When an extent gets to a certain size, or on write failure/LB, seal the extent’s replica set and never append any more data to it
      • Write availability: to handle failures during write
          • Seal the extent’s replica set
          • Append immediately to a new extent (replica set) on 3 other available nodes
          • Add this new extent to the end of the partition’s log (stream)
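The write path on this slide can be illustrated with a toy, in-memory model. This is only a sketch under simplifying assumptions, not how the storage service is implemented: `Node`, `Extent`, `Stream` and `max_records` are hypothetical names, a replica set is any 3 healthy nodes (the slide says 3 *other* nodes), and health is checked before writing so partial appends never occur.

```python
from dataclasses import dataclass

REPLICAS = 3  # replica count taken from the slide


@dataclass
class Node:
    name: str
    healthy: bool = True


class Extent:
    """One extent with an in-memory copy per replica node (toy model)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.copies = {n.name: [] for n in self.nodes}
        self.sealed = False

    def try_append(self, record):
        """Append to every replica in the same order; succeed only if all can commit."""
        if self.sealed or not all(n.healthy for n in self.nodes):
            return False
        for n in self.nodes:
            self.copies[n.name].append(record)
        return True


class Stream:
    """A log made of extents; only the last extent accepts appends."""

    def __init__(self, nodes, max_records=4):
        self.nodes = list(nodes)
        self.max_records = max_records
        self.extents = [self._new_extent()]

    def _new_extent(self):
        healthy = [n for n in self.nodes if n.healthy]
        if len(healthy) < REPLICAS:
            raise RuntimeError("not enough healthy nodes for a replica set")
        return Extent(healthy[:REPLICAS])

    def append(self, record):
        last = self.extents[-1]
        if not last.try_append(record):
            # Write failure or sealed extent: seal it and continue on a new replica set.
            last.sealed = True
            last = self._new_extent()
            self.extents.append(last)
            if not last.try_append(record):
                raise RuntimeError("append failed on a fresh extent")
        if len(last.copies[last.nodes[0].name]) >= self.max_records:
            last.sealed = True  # extent reached its size limit

    def read_all(self):
        # Any replica would do: copies of an extent are identical.
        return [r for e in self.extents for r in e.copies[e.nodes[0].name]]


nodes = [Node(f"n{i}") for i in range(5)]
log = Stream(nodes)
log.append("a")
log.append("b")
nodes[1].healthy = False   # a replica of the current extent fails
log.append("c")            # seals the first extent, appends to a new one
```

After the failure, the first extent is sealed with two records on all three of its replicas, and the third record lands in a new extent whose replica set excludes the failed node.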
  • 16. Availability with Consistency for Reading
      • Read consistency: can read from any replica, since data in each replica for an extent is bit-wise identical
      • Read availability: send out parallel read requests if the first read is taking longer than the 95th-percentile latency
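The read-availability rule can be sketched as a "hedged" read: wait for the first replica up to the 95th-percentile latency, then start a parallel read on another replica and take whichever answers first. The helper names (`p95`, `hedged_read`, `make_replica`) and the latency samples are illustrative assumptions, not part of any storage client library.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def p95(samples):
    """Nearest-rank 95th percentile of observed latencies (seconds)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def hedged_read(replicas, threshold):
    """replicas: zero-argument callables, one per replica. Returns the first result."""
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    try:
        pending = {pool.submit(replicas[0])}
        for backup in replicas[1:]:
            done, pending = wait(pending, timeout=threshold,
                                 return_when=FIRST_COMPLETED)
            if done:
                return next(iter(done)).result()
            # Current reads are slower than the threshold: start another in parallel.
            pending.add(pool.submit(backup))
        done, _ = wait(pending, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        pool.shutdown(wait=False)  # do not block on the slow read


def make_replica(name, delay):
    def read():
        time.sleep(delay)          # simulated read latency
        return name
    return read


# Made-up latency history: 18 fast reads, one at 50 ms, one outlier.
threshold = p95([0.02] * 18 + [0.05, 0.3])

slow_first = hedged_read([make_replica("replica-0", 0.5),
                          make_replica("replica-1", 0.01)], threshold)
fast_first = hedged_read([make_replica("replica-0", 0.01),
                          make_replica("replica-1", 0.5)], threshold)
```

Because every replica of an extent holds identical data, it does not matter which replica ends up serving the read.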
  • 17. Dynamic Load Balancing – Partition Layer
      • Spreads index/transaction processing across partition servers
          • Master monitors traffic load/resource utilization on partition servers
          • Dynamically load balances partitions across servers to achieve better performance/availability
          • Does not move data around, only reassigns what part of the index a partition server is responsible for
  • 18. Dynamic Load Balancing – DFS Layer
      • DFS read load balancing across replicas
          • Monitor latency/load on each node/replica; dynamically select which replica to read from and start additional reads in parallel based on the 95th-percentile latency
  • 19. Architecture Summary
      • Durability: all data stored with at least 3 replicas
      • Consistency: all committed data is identical across all 3 replicas
      • Availability: can read from any of the 3 replicas; if there are any issues writing, seal the extent and continue appending to a new extent
      • Performance/Scale: retry based on 95th-percentile latencies; auto scale out and load balance based on load/capacity
      • Additional details can be found in the SOSP paper:
          • “Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency”, ACM Symposium on Operating Systems Principles (SOSP), Oct. 2011
  • 20. Best Practices
  • 21. General .NET Best Practices For Azure Storage
      • Disable Nagle for small messages (< 1400 bytes)
          • ServicePointManager.UseNagleAlgorithm = false;
      • Disable Expect 100-Continue*
          • ServicePointManager.Expect100Continue = false;
      • Increase default connection limit
          • ServicePointManager.DefaultConnectionLimit = 100; (or more)
      • Take advantage of .NET 4.5 GC
          • GC performance is greatly improved
          • Background GC:
  • 22. General Best Practices
      • Locate storage accounts close to compute/users
      • Understand account scalability targets
          • Use multiple storage accounts to get more
          • Distribute your storage accounts across regions
      • Consider heating up the storage for better performance
      • Cache critical data sets
          • To get more requests/sec than the account/partition targets
          • As a backup data set to fall back on
      • Distribute load over many partitions and avoid spikes
  • 23. General Best Practices (cont.)
      • Use HTTPS
      • Optimize what you send & receive
          • Blobs: range reads, metadata, HEAD requests
          • Tables: Upsert, Projection, Point Queries
          • Queues: Update Message
      • Control parallelism at the application layer
          • Unbounded parallelism can lead to high latencies and throttling
      • Enable Logging & Metrics on each storage service
  • 24. Blob Best Practices
      • Try to match your read size with your write size
          • Avoid reading small ranges on blobs with large blocks
          • CloudBlockBlob.StreamMinimumReadSizeInBytes / StreamWriteSizeInBytes
      • How do I upload a folder the fastest?
          • Upload multiple blobs simultaneously
      • How do I upload a blob the fastest?
          • Use parallel block upload
      • Concurrency (C): multiple workers upload different blobs
      • Parallelism (P): multiple workers upload different blocks for the same blob
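The difference between concurrency (C) and parallelism (P) can be sketched with two bounded worker pools. The example below runs against `FakeBlobStore`, a hypothetical in-memory stand-in; its `put_block` and `put_block_list` methods only mirror the idea of staging blocks and then committing an ordered block list, and are not a real client library API. The block size and worker counts are arbitrary.

```python
import base64
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeBlobStore:
    """Hypothetical in-memory stand-in for a block blob service."""

    def __init__(self):
        self.staged = {}   # (blob name, block id) -> bytes
        self.blobs = {}    # blob name -> committed content
        self._lock = threading.Lock()

    def put_block(self, blob, block_id, chunk):
        with self._lock:
            self.staged[(blob, block_id)] = chunk

    def put_block_list(self, blob, block_ids):
        # Committing the ordered id list assembles the blob.
        self.blobs[blob] = b"".join(self.staged[(blob, b)] for b in block_ids)


def split_into_blocks(data, block_size):
    """Return (block_id, chunk) pairs; ids are fixed-width so they stay ordered."""
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_id = base64.b64encode(f"{index:06d}".encode()).decode()
        blocks.append((block_id, data[offset:offset + block_size]))
    return blocks


def upload_blob(store, name, data, block_size, parallelism):
    """P: several workers upload different blocks of the same blob."""
    blocks = split_into_blocks(data, block_size)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(store.put_block, name, block_id, chunk)
                   for block_id, chunk in blocks]
        for future in futures:
            future.result()   # surface any upload error
    store.put_block_list(name, [block_id for block_id, _ in blocks])


def upload_folder(store, files, block_size, concurrency, parallelism):
    """C: several workers upload different blobs at the same time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(upload_blob, store, name, data,
                               block_size, parallelism)
                   for name, data in files.items()]
        for future in futures:
            future.result()


files = {f"file{i}.bin": bytes([i]) * 1000 for i in range(4)}
store = FakeBlobStore()
upload_folder(store, files, block_size=256, concurrency=4, parallelism=2)
```

Both pools have an explicit `max_workers`, which keeps total in-flight requests at no more than C × P instead of growing without bound.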
  • 25. Concurrency vs. Blob Parallelism
      • XL VM uploading 512 blobs of 256MB each (total upload size = 128GB)
      • C=1, P=1 => averaged ~13.2 MB/s
      • C=1, P=30 => averaged ~50.72 MB/s
      • C=30, P=1 => averaged ~96.64 MB/s
      • Single TCP connection is bound by TCP rate control & RTT
      • P=30 vs. C=30: the C=30 test completed almost twice as fast!
      • Single blob is bound by the limits of a single partition
      • Accessing multiple blobs concurrently scales
      • [Chart: upload time in seconds per configuration, axis from 0 to 10000]
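As a rough cross-check of the "almost twice as fast" claim, the upload times can be derived from the averaged throughputs on the slide, assuming each average holds for the whole 128GB run (an assumption; the slide only shows the chart).

```python
TOTAL_MB = 512 * 256   # 512 blobs of 256 MB = 131072 MB = 128 GB

# Averaged throughputs from the slide, in MB/s.
rates = {"C=1, P=1": 13.2, "C=1, P=30": 50.72, "C=30, P=1": 96.64}

# Estimated wall-clock time per configuration, in seconds.
seconds = {config: TOTAL_MB / rate for config, rate in rates.items()}

# How much faster C=30 finishes than P=30.
speedup = seconds["C=1, P=30"] / seconds["C=30, P=1"]

for config, secs in seconds.items():
    print(f"{config}: {secs:,.0f} s")
print(f"C=30 vs. P=30 speedup: {speedup:.2f}x")
```

Under that assumption the runs take roughly 9,930 s, 2,584 s and 1,356 s, which fits the 0 to 10000 second axis of the chart, and C=30 finishes about 1.9 times faster than P=30.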
  • 26. Blob Download
      • XL VM downloading 50 blobs of 256MB each (total download size = 12.5GB)
      • C=1, P=1 => averaged ~96 MB/s
      • C=30, P=1 => averaged ~130 MB/s
      • [Chart: download time in seconds for C=1, P=1 and C=30, P=1, axis from 0 to 140]
  • 27. Table Best Practices
      • Critical queries: select PartitionKey, RowKey to avoid hotspots
          • Table scans are expensive; avoid them at all costs for latency-sensitive scenarios
      • Batch: same PartitionKey for entities that need to be updated together
      • Schema-less: store multiple types in the same table
      • Single index – {PartitionKey, RowKey}: if needed, concatenate columns to form composite keys
      • Entity locality: {PartitionKey, RowKey} determines sort order
          • Store related entities together to reduce IO and improve performance
      • Table Service Client Layer in 2.1 and 2.2: dramatic performance improvements and better NoSQL interface
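Since {PartitionKey, RowKey} is the only index and keys are compared as strings, composite keys need fixed-width parts to sort as intended. The following is a minimal sketch with a plain dictionary standing in for a table; `composite_key`, the separator, the padding width and the customer/order entities are all illustrative assumptions.

```python
def composite_key(*parts, width=10):
    """Join columns into one key; non-negative ints are zero-padded so that
    string order matches numeric order."""
    encoded = []
    for part in parts:
        if isinstance(part, int):
            encoded.append(f"{part:0{width}d}")
        else:
            encoded.append(str(part))
    return "_".join(encoded)


# Toy table: a dict keyed by (PartitionKey, RowKey).
table = {}


def insert(partition_key, row_key, entity):
    table[(partition_key, row_key)] = entity


# Orders of one customer share a PartitionKey, so they are stored together
# and could be updated in one batch; RowKey = order date + order number.
for number, date in [(2, "2013-09-01"), (10, "2013-09-01"), (1, "2013-08-15")]:
    insert("customer-42", composite_key(date, number), {"order": number})
insert("customer-7", composite_key("2013-09-03", 5), {"order": 5})

# Point query: both keys known, no scan needed.
point = table[("customer-42", composite_key("2013-09-01", 10))]

# Range query inside one partition: all September orders, in key order.
september = [table[key] for key in sorted(table)
             if key[0] == "customer-42" and key[1].startswith("2013-09")]
```

Without the padding, order 10 would sort before order 2, because "10" compares less than "2" as a string.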
  • 28. Queue Best Practices
      • Make message processing idempotent: messages become visible again if the client worker fails to delete the message
      • Benefit from Update Message: extend visibility time based on the message, or save intermediate state
      • Message count: use this to scale workers
      • Dequeue count: use it to identify poison messages or to validate the invisibility time used
      • Blobs to store large messages: increase throughput by having larger batches
      • Multiple queues: to get more than a single queue (partition) target
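Idempotent processing and poison-message handling can be sketched against a toy queue. `FakeQueue` is a hypothetical in-memory stand-in that ignores visibility timeouts (an undeleted message is immediately visible again), and `MAX_DEQUEUE` is an arbitrary threshold chosen for the example.

```python
MAX_DEQUEUE = 3   # arbitrary retry limit for this sketch


class FakeQueue:
    """In-memory stand-in: a message that is not deleted is delivered again."""

    def __init__(self, bodies):
        self.messages = [{"id": i, "body": body, "dequeue_count": 0}
                         for i, body in enumerate(bodies)]

    def get(self):
        if not self.messages:
            return None
        message = self.messages[0]
        message["dequeue_count"] += 1
        return message

    def delete(self, message):
        self.messages.remove(message)


def drain(queue, handler, processed_ids, poison):
    while True:
        message = queue.get()
        if message is None:
            break
        if message["dequeue_count"] > MAX_DEQUEUE:
            # Failed too many times: set it aside instead of retrying forever.
            poison.append(message["body"])
            queue.delete(message)
            continue
        if message["id"] not in processed_ids:
            try:
                handler(message["body"])
            except Exception:
                continue   # not deleted, so it becomes visible again
            processed_ids.add(message["id"])
        # Already processed (e.g. a worker died before deleting): just delete.
        queue.delete(message)


handled = []


def handler(body):
    if body == "bad":
        raise ValueError("cannot process")
    handled.append(body)


queue = FakeQueue(["a", "bad", "b"])
processed_ids = {0}   # pretend "a" was processed before a crash, but never deleted
poison = []
drain(queue, handler, processed_ids, poison)
```

In this run "a" is deleted without being handled a second time, "bad" is retried three times and then moved to the poison list, and "b" is processed normally.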
  • 29. Thank you!
      • Q&A