SVCC: Code Shaming and Antipatterns

  1. ... Web Role Azure Cloud Service
  2. Service Bus Web Role Worker Azure Cloud Service
  3. Service Bus Queue Message Batch Process Messages Process Message Process Message ..
  4. Service Bus Queue Message Batch Process Messages Process Message .. Process Message
  5. [Chart: Variation in Message Processing. Series: Avg, Min, Max. Categories: Message Type 1 through Message Type 8. Time axis: 00:00.0 to 00:30.2.]
  6. http://channel9.msdn.com/Series/PerfView-Tutorial/Tutorial-12-Wall-Clock-Time-Investigation-Basics
  7. Cloud Service Boundary Load Balancer Web Servers Database App Servers Azure Queue(s)
  8. ... Web Role Azure Cloud Service 500 databases
  9. Azure Load Balancer DB1 DB2 DB3
     SrcIp    SrcPort  DestIp   DestPort
     A.B.C.D  1        E.F.G.H  1433
     A.B.C.D  2        E.F.G.H  1433
  10. ... Web Role Azure Cloud Service 500 databases Content moderation service
  11. [Chart: Web Request Response Latency. Series: Avg Latency, Response Latency. Axis label: Seconds. Axis ranges: 0 to 450 and 1 to 29.]
  12. ... Azure Cloud Service Web Role Worker Blob Queue Azure Storage Account
  13. Query / Throughput / Latency / Reach
      • Query: Every 30 seconds, each device publishes a status update (location, health, etc.)
        Throughput: 4k – 100k msgs/sec
        Latency: 2000 – 5000 ms
        Reach: Single device
      • Query: Every 10 minutes, a batch job retrieves all of the status updates delivered in the past 10 minutes
        Throughput: 2M msgs / 10 minutes
        Latency: 2 minutes
        Reach: All devices
      • Query: On an ad-hoc basis, a user may request the current status and recent history of all of their devices
        Throughput: 15 requests / second
        Latency: 500 ms
        Reach: Limited device set
      • Query: On an ad-hoc basis, a user may request a historical time range of all of their devices
        Throughput: 5 requests / second
        Latency: 750 ms
        Reach: Limited device set
  14. STB Readiness
      • This isn’t a relational workload
        Per-device insert and lookup
        Periodic batch transfer
      • Per-device lookup
        Natural fit for table storage
        Device ID = Pk
        Data type = Rk
      • Periodic batch transfer
        Natural fit for blob storage
        Instance + Timestamp = blob id
        Buffer and write into blocks
        Roll over on time interval (10 min)
      • Time/space buffer
        0101 1101 0111 1101 0111 ...
      • Table Storage
        Pk={Device;Day}, Rk={Timestamp}
        Payload={fields}
      • Blob Storage
        Uri={Minute;Instance}
        Payload={JSON Data}
      • Querying by device
        By time - direct { PkRk } lookup
        By day - direct { Pk }, max of 2880 records per partition
      • Batch transfer by time frame
        Parallel download of all blobs matching timeframe pattern
      • Adding scale capacity
        20k operations per storage account,
  15. Where are the scalability bottlenecks? Where are the availability and failure points? Where are the key insight and instrumentation points?
      Cloud Service Front End Web Role Instance Instance Instance Instance Caching Role Instance Instance Worker Role Instance Databases DB DB DB DB Storage Storage Account Storage Account
  16. http://channel9.msdn.com/Series/FailSafe
      http://code.msdn.microsoft.com/windowsazure/ContosoSocial-in-Windows-8dd9052c
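
The key scheme on slide 14 can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function names, the `;` separator, the timestamp formats, and the sample device and instance IDs are all assumptions. Only the shapes Pk={Device;Day}, Rk={Timestamp}, Uri={Minute;Instance}, the 30-second publish interval, and the 10-minute rollover come from the slides.

```python
from datetime import datetime, timezone
from typing import Tuple

UPDATE_INTERVAL_SECONDS = 30  # slide 13: each device publishes every 30 seconds
ROLLOVER_MINUTES = 10         # slide 14: blobs roll over on a 10-minute interval


def table_keys(device_id: str, ts: datetime) -> Tuple[str, str]:
    """Build (partition key, row key) as Pk={Device;Day}, Rk={Timestamp}.

    The separator and date formats are assumptions for illustration.
    """
    partition_key = f"{device_id};{ts:%Y%m%d}"
    row_key = f"{ts:%Y%m%dT%H%M%S}"
    return partition_key, row_key


def blob_name(instance_id: str, ts: datetime) -> str:
    """Build a blob name as Uri={Minute;Instance}.

    The minute is floored to the start of its rollover window so that every
    write from one instance within the same window lands in the same blob.
    """
    window_minute = ts.minute - ts.minute % ROLLOVER_MINUTES
    window = ts.replace(minute=window_minute, second=0, microsecond=0)
    return f"{window:%Y%m%dT%H%M};{instance_id}"


def max_records_per_partition() -> int:
    """Upper bound on rows in one {Device;Day} partition."""
    return 24 * 60 * 60 // UPDATE_INTERVAL_SECONDS


if __name__ == "__main__":
    ts = datetime(2014, 10, 11, 9, 47, 30, tzinfo=timezone.utc)
    print(table_keys("device42", ts))
    print(blob_name("worker_0", ts))
    print(max_records_per_partition())
```

With one update every 30 seconds, a device writes 2 rows per minute, so a {Device;Day} partition holds at most 24 × 60 × 2 = 2880 rows, which matches the per-partition limit quoted on the slide. A by-day query reads a single partition, and a by-time query is a direct partition key plus row key lookup.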

Editor's Notes

  • Optimize for the most stringent case
    More options for latency-insensitive workflows

    Simplicity is king
    Use the simplest, most robust approach that fulfills needs
    But not simpler…
    That’s not always the obvious or familiar one…

    No one, true solution
    Favour composition of approaches to improve resiliency and reduce complexity

  • Periodic query spike on bulk reporting
    Impact to online operations (30M+ rows)
    Rebalancing
    Moving data between partitions / databases
    Distribution of reference data (relational model)
    Keeping in sync
    Impact of noisy neighbors (Azure SQL DB)
    Variable latency, pushback under heavy load
    Cost of management (SQL IaaS)
    Cost of automation for patching, maintenance

  • Scalability bottlenecks
    No longer able to add additional capacity?
    How to add additional scale units?
    Where are the key optimization points?
    Messaging, serialization, asynchronicity
    Availability and failure points
    Node level
    Service level
    DC level
    Operational and instrumentation points
