Microsoft Confidential – Internal Use
Show Me the Model!
SCALE
No convention for URL/URI structure
Level of detail varies greatly across implementations
Discoverability quality varies dramatically
Retrieval and search support is weak
HTTP Verb
API End Point
Data Format for Response
HTTP Code
Caching Capabilities
Response Format
OData Custom Header Tag
Modern Authentication Protocols: OAuth 2.0, WS-Fed, SAML 2.0, OpenID Connect
Task               Operation  URI
Create an Order    POST       http://api.contoso.com/CreateOrder?OrderID=1
Approve an Order   POST       http://api.contoso.com/ApproveOrder?OrderID=1
Delete the Order   POST       http://api.contoso.com/DeleteOrder?Order=1
Cancel the Order   POST       http://api.contoso.com/CancelOrder?Order=1

Task               Operation  URI
Create an Order    POST       http://api.contoso.com/Order/1?OrderName="Contoso Reorder"
Approve the Order  PUT        http://api.contoso.com/Order/1/approval
Delete the Order   DELETE     http://api.contoso.com/Order/1
Cancel the Order   PUT        http://api.contoso.com/Order/1/cancellation
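The resource-oriented URIs listed above can be read as a small dispatch table keyed on HTTP verb and resource path. The following is a minimal Python sketch of that idea, not the deck's implementation; the `route` function and the action names are hypothetical.

```python
import re

# Hypothetical route table: (HTTP verb, path pattern, action name).
# The paths mirror the resource-oriented URIs listed above.
ROUTES = [
    ("POST",   re.compile(r"^/Order/(\d+)$"),              "create"),
    ("PUT",    re.compile(r"^/Order/(\d+)/approval$"),     "approve"),
    ("DELETE", re.compile(r"^/Order/(\d+)$"),              "delete"),
    ("PUT",    re.compile(r"^/Order/(\d+)/cancellation$"), "cancel"),
]


def route(verb, path):
    """Return (action, order_id) for a request, or None if nothing matches."""
    for route_verb, pattern, action in ROUTES:
        match = pattern.match(path)
        if route_verb == verb.upper() and match:
            return action, int(match.group(1))
    return None
```

The verb carries the operation and the path identifies the order, so adding an operation means adding a row rather than inventing a new endpoint name.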
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Mon, 06 Jan 2014 15:22:04 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=[omitted]; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.ycombinator.com; HttpOnly
Last-Modified: Mon, 06 Jan 2014 13:14:48 GMT
Vary: Accept-Encoding
Expires: Thu, 04 Jan 2024 13:14:48 GMT
Cache-Control: max-age=315352364
Cache-Control: public
CF-RAY: [omitted]

<html>
<head>
  <link rel="stylesheet" type="text/css" href="/news.css">
  <link rel="shortcut icon" href="/favicon.ico">
  <title>Hacker News</title>
</head>
<body>
<center>
<table border="0" cellpadding="0" cellspacing="0" width="85%" bgcolor="#f6f6ef">
  <tr>
    <td bgcolor="#ff6600">
      <table border="0" cellpadding="0" cellspacing="0" width="100%" style="padding:2px">
        <tr>
          <td style="width:18px;padding-right:4px">
            <a href="http://ycombinator.com"> <img src="/y18.gif" width="18" height="18" style="border:1px #ffffff solid;" /> </a>
          </td>
          <td style="line-height:12pt; height:10px;">
            <span class="pagetop"> <b><a href="/news">Hacker News</a></b> </span>
          </td>
        </tr>
      </table>
    </td>
  </tr>
  <tr style="height:10px"></tr>
  <tr>
    <td> Sorry for the downtime. We hope to be back soon. </td>
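The response above reports 200 OK with a multi-year cache lifetime for what is really an outage page. One common alternative is to answer with 503 Service Unavailable, a Retry-After header, and caching disabled. The sketch below assumes the service can tell that it is down; the function name and the return shape are hypothetical.

```python
def maintenance_response(retry_after_seconds):
    """Build (status, headers, body) for a planned or unplanned outage."""
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        # Tell well-behaved clients when it is reasonable to try again.
        "Retry-After": str(int(retry_after_seconds)),
        # Keep caches and proxies from storing the outage page.
        "Cache-Control": "no-store",
    }
    body = "Sorry for the downtime. We hope to be back soon."
    return 503, headers, body
```

A client that sees the 503 can back off for the advertised interval instead of treating the apology page as a successful result.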
Retry logic
Idempotency
Protect calls with timeouts on outbound requests
• Fast retries often fail again. Exponential back-off is useful.
• Error codes should provide insight.
• Allows restart of requests which may have partially or fully succeeded.
• Don’t retry on timeouts.
• Queue work for slow retry later.
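The bullets above can be combined into one small helper: retry transient failures with exponential back-off, send the same idempotency key on every attempt, and leave timeouts to a slower queued retry. This is a sketch under assumptions, not a definitive implementation; `TransientError`, `call_with_retry`, and the key-passing convention are all hypothetical.

```python
import time
import uuid


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(idempotency_key) with exponential back-off.

    The same key is sent on every attempt so the server can recognise a
    repeat of a request that already partially or fully succeeded.
    TimeoutError is deliberately not caught: a timed-out call is left for
    the caller to queue for a slower retry later.
    """
    key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return operation(key)
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.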
Compensating behavior
Last resort
Alternate path
Omission
• Example: Serve stale data from cache or switch to read-only mode.
• Please try again
• Never show:
• Example: Return a message saying the transaction is being processed and the charge confirmation will arrive later by e-mail.
• Example: Browsing items might not include inventory count.
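Three of the degradation options above (alternate path, omission, last resort) can be illustrated with one page-building function. This is a minimal sketch; `product_view`, its callable parameters, and the dictionary layout are hypothetical.

```python
def product_view(product_id, load_product, load_inventory, cache):
    """Build a product page model that degrades instead of failing.

    load_product and load_inventory are hypothetical callables that raise
    on failure; cache is a plain dict of previously served products.
    """
    try:
        product = load_product(product_id)
        cache[product_id] = product
        stale = False
    except Exception:
        if product_id not in cache:
            # Last resort: nothing to show, ask the user to try again.
            return {"error": "Please try again"}
        # Alternate path: serve the stale copy from the cache.
        product = cache[product_id]
        stale = True

    view = {"product": product, "stale": stale}
    try:
        view["inventory"] = load_inventory(product_id)
    except Exception:
        # Omission: leave the inventory count out rather than fail the page.
        pass
    return view
```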
using System;
using Microsoft.Practices.TransientFaultHandling;
using Microsoft.Practices.EnterpriseLibrary.Common.Configuration;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling;
// ...

// Get an instance of the RetryManager class.
var retryManager = EnterpriseLibraryContainer.Current.GetInstance<RetryManager>();

// Create a retry policy that uses a retry strategy from the configuration.
var retryPolicy = retryManager.GetRetryPolicy<StorageTransientErrorDetectionStrategy>("Incremental Retry Strategy");

// Receive notifications about retries.
retryPolicy.Retrying += (sender, args) =>
{
    // Log details of the retry.
    var msg = String.Format("Retry - Count:{0}, Delay:{1}, Exception:{2}",
        args.CurrentRetryCount, args.Delay, args.LastException);
    // Pass msg to your logging handler of choice, and choose it wisely!
};

try
{
    // Do some work that may result in a transient fault: call a method that
    // uses Windows Azure storage and may throw a transient exception.
    // Returning the result from the lambda lets it be assigned to blobs.
    var blobs = retryPolicy.ExecuteAction(() => this.container.ListBlobs());
}
catch (Exception)
{
    // All the retries failed.
}
<RetryPolicyConfiguration
    defaultRetryStrategy="Fixed Interval Retry Strategy"
    defaultSqlConnectionRetryStrategy="Backoff Retry Strategy"
    defaultSqlCommandRetryStrategy="Incremental Retry Strategy"
    defaultAzureStorageRetryStrategy="Fixed Interval Retry Strategy"
    defaultAzureServiceBusRetryStrategy="Fixed Interval Retry Strategy">
  <incremental name="Incremental Retry Strategy" retryIncrement="00:00:01" retryInterval="00:00:01" maxRetryCount="10" />
  <fixedInterval name="Fixed Interval Retry Strategy" retryInterval="00:00:01" maxRetryCount="10" />
  <exponentialBackoff name="Backoff Retry Strategy" minBackoff="00:00:01" maxBackoff="00:00:30" deltaBackoff="00:00:10"
      maxRetryCount="10" fastFirstRetry="false" />
</RetryPolicyConfiguration>
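To get a feel for how the three strategy types in the configuration above space out retries, the following Python sketch computes simplified delay schedules in seconds. It is an illustration only and not the library's formula: the function names are hypothetical, the incremental schedule assumes `retryInterval` is the first wait, and the exponential schedule omits the random jitter that production implementations usually add.

```python
def fixed_interval(retry_interval, max_retry_count):
    """Same wait before every retry."""
    return [retry_interval] * max_retry_count


def incremental(initial_interval, increment, max_retry_count):
    """Wait grows by a constant amount on each retry."""
    return [initial_interval + increment * n for n in range(max_retry_count)]


def exponential_backoff(min_backoff, max_backoff, delta_backoff, max_retry_count):
    """Wait grows geometrically, capped at max_backoff (no jitter here)."""
    return [
        min(min_backoff + delta_backoff * (2 ** n - 1), max_backoff)
        for n in range(max_retry_count)
    ]
```

With the values from the configuration, `exponential_backoff(1, 30, 10, 10)` reaches the 30-second cap by the third retry, while `incremental(1, 1, 10)` only reaches 10 seconds on the last one.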
Used in conjunction with timeouts
Always alert
Used to combat slow responses
• Counter-based action
• Often activates admission control with metering to allow recovery
• Often activates alternative pathway
• Mitigations should have monitored counters too
• Instrument all calls with timers
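A counter-based mitigation of this kind is commonly implemented as a circuit breaker. The Python sketch below is one possible reading of the bullets above, not the deck's own design: after a threshold of consecutive failures (timeouts included) it rejects calls for a cool-down period, and it keeps a separate counter for rejected calls so the mitigation itself can be monitored. All names are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0      # counter that drives the breaker
        self.rejected = 0      # the mitigation has its own monitored counter
        self.opened_at = None

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                self.rejected += 1
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: let one trial call through.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

A caller that catches `CircuitOpenError` can switch to an alternative pathway or a fallback, and an alert can be raised whenever `opened_at` is set.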
Fallbacks
Custom Fallback
Fail Fast
Fail Silent
• Client library can provide an invocable callback method
• Can also use locally available data on the API server (cookie or cache) to generate a fallback response
• When data is required and there’s no good fallback
• Negative UX impact, but keeps API healthy
• Return a null value. Useful if the data is optional
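The three fallback styles above differ only in what happens when the wrapped call fails. A minimal Python sketch, with hypothetical function names:

```python
def with_fallback(operation, fallback):
    """Custom fallback: run a caller-supplied callback when the call fails."""
    try:
        return operation()
    except Exception:
        return fallback()


def fail_silent(operation):
    """Fail silent: return None when the data is optional."""
    return with_fallback(operation, lambda: None)


def fail_fast(operation):
    """Fail fast: no good fallback exists, so let the error propagate at once."""
    return operation()
```

For example, `with_fallback(load_profile, lambda: cached_profile)` would serve locally available data, where `load_profile` and `cached_profile` stand in for whatever the API server has on hand.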
Design the strategy early on
Perform quickly
Return a specific error code
Can be used together with auto-scaling
Consider aggressive auto-scaling if demand grows very quickly
Ensure a system meets SLA
Handle burst activity
Prevent a single tenant from monopolizing resources
Help cost-optimize a system by limiting the maximum resource levels
Combine with auto-scaling
// Allow 60 requests per hour for all users.
config.MessageHandlers.Add(new ThrottlingHandler(
    new InMemoryThrottleStore(),
    id => 60,
    TimeSpan.FromHours(1)));

// Allow 5000 requests per hour for the IP 10.0.0.1 and 60 per hour for everyone else.
config.MessageHandlers.Add(new ThrottlingHandler(
    new InMemoryThrottleStore(),
    id =>
    {
        if (id == "10.0.0.1")
        {
            return 5000;
        }
        return 60;
    },
    TimeSpan.FromHours(1)));
Source: http://blog.maartenballiauw.be/post/2013/05/28/Throttling-ASPNET-Web-API-calls.aspx
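The handler configuration above takes a store, a per-identifier limit function, and a time window. The Python sketch below shows one simple way such a throttle could work internally, using a fixed window per identifier. It is an assumption-laden illustration and not the code behind `ThrottlingHandler`; the class and method names are hypothetical.

```python
import time
from collections import defaultdict


class FixedWindowThrottle:
    def __init__(self, limit_for, window_seconds, clock=time.monotonic):
        self.limit_for = limit_for            # callable: identifier -> max requests
        self.window_seconds = window_seconds
        self.clock = clock
        # identifier -> [window start time, requests counted in that window]
        self.windows = defaultdict(lambda: [None, 0])

    def allow(self, identifier):
        """Return True if the request fits in the caller's current window."""
        now = self.clock()
        window = self.windows[identifier]
        if window[0] is None or now - window[0] >= self.window_seconds:
            window[0], window[1] = now, 0     # start a fresh window
        if window[1] >= self.limit_for(identifier):
            return False                      # e.g. answer with a specific error code
        window[1] += 1
        return True
```

A limit function such as `lambda ident: 5000 if ident == "10.0.0.1" else 60` with `window_seconds=3600` mirrors the second configuration above.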
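// Override to tailor to your needs.
public class MyThrottlingHandler : ThrottlingHandler
{
    // ...
    protected override string GetUserIdentifier(HttpRequestMessage request)
    {
        // Your user id generation logic here. Calling the base implementation
        // is a placeholder (an assumption) so that the method returns a value.
        return base.GetUserIdentifier(request);
    }
}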
Designing Evolvable Web APIs with ASP.NET
Roy Fielding's Dissertation on REST
Richardson Maturity Model
RESTful Web APIs
Architecting fail safe data services


Editor's Notes

  • #9 And no matter how you got your news – via print, television, or social media – chances are you heard about it.
  • #12 If you focus on scaling up in the cloud, you’re going to be about as happy as this guy. It’s about scaling out vs. scaling up.
  • #13 Wrong, Will Ferrell: failure is a very good path.
  • #14 In fact, you should Always Be Failing… including in production. You need to
  • #15 Decompose by Workload
  • #20 It doesn’t matter how super your service is, it likely has some constraints.
  • #23 Redundant Deployments
  • #25 Viddy – Social Site focused on Videos
  • #27 Scale Unit
  • #28 Telemetry