REST APIs require just as much design effort as traditional APIs, and they demand a unique skillset. To make matters worse, very few people building REST APIs have experience building world-class services. This almost always results in breaking changes. In this talk we will dive deep into concrete examples drawn from building some of the world's largest REST APIs. We will examine several specific points of guidance and then attempt to extrapolate more general principles for future-proofing REST APIs.
11. API franca
• HTTP APIs can learn from lingua francas
• HTTP APIs should be interoperable
• Not making enough of an effort
• Most HTTP APIs are “bespoke” – built-to-fit
• There are no bonus points for originality
14. Disclaimers
• These are my opinions
• The opinions are conversation starters
• Focus on the future, not the past
• Not exhaustive
• Feedback appreciated
15. 1. Policies
2. Versioning
3. Headers
4. Pagination
5. Separate data & metadata
6. Don’t mint mime types
7. Wrap arrays & primitives
8. Responses should be self-describing
9. Use a consistent query syntax
10. Be consistent elsewhere
17. Important policies
• Terms of service
• Privacy
• Deprecation
• Breaking change
• SLAs
• Rate limits
• Licenses
• Support
20. What should it look like?
• All policies in one place
• Somewhere developers will “stumble” over them
• Refresh developers on policies
• Consider versioning the API for policy changes
25. Additional considerations
• How long to keep a version around
• How to deprecate a version
• Whether to require clients to request a version
• How to handle breaking changes if a version is not required
• Provide a –pre (or similar) suffix
28. Rule of thumb
HEADERS ARE ONLY FOR DATA CONCEPTUALLY SCOPED TO THE REQUEST OR RESPONSE
39. Server-driven paging
• Data has a tendency to grow
• Server should force pagination to prevent DoS
• Clients should always be prepared for paginated collections
40. Server-driven paging guidance
• Tell clients to expect that any collection may be paginated
• Decide what a continuation token is (opaque string? URL?)
• Put continuation tokens in the response body
89. Contact info
• Mark Stafford
• Microsoft
• mastaffo@microsoft.com
• http://www.odata.org
Editor's Notes
When I was a kid, I thought a lot about the future.
I dreamt of the day that I would own a flying car
(which was mostly symbolic since we’d teleport everywhere we wanted to go).
In the future, I’d be healed of any injury in a magic bed.
In the future, I’d have a holodeck in my condo on the 9,431st floor.
And in the future, I’d take my kids to visit the colony on Mars. Everything about the future was positive and amazing.
Fast forward three decades, a few degrees, a wife, and four – yes, four – kids,
and I don’t dream quite as much about the future. I do, however, still think about the future a lot. It’s not a future as exciting as warp speed, nor is it difficult to identify the line between the real and the virtual world, but hey – it’s a step in the right direction.
Part of the effort I spend thinking about the future is spent thinking about HTTP APIs. More specifically, I spend a lot of time thinking about two things.
First, I have this deep, deep desire to see HTTP APIs standardize somewhat. One of the other talks I give is called API franca, and in it I draw examples from human language. We see common languages everywhere. I live in China, where Putonghua, or Mandarin, is the lingua franca. I just spoke last week at a conference in Spain to Spaniards, French, Dutch, Germans, and more. Guess what language we spoke?
HTTP APIs need to learn a lesson from this. English, Putonghua, and many others are examples of lingua francas – shared language. If we care about communicating across boundaries, we need to make a better effort to do so.
HTTP APIs purport to be designed for interop, but they certainly aren’t making much of an effort to use a shared language. Everybody does pretty much everything other than HTTP and JSON their own way.
One additional note here. I sometimes use the word “bespoke” to refer to the custom APIs I see out there. Bespoke is a fancy – but perfect – word for a custom-built HTTP API. But this is interop. You don’t get bonus points for originality.
But this is not that talk and I need to stay off that soapbox or you won’t be able to get me down.
This is a talk about making an HTTP API more stable over time. It’s a bit foolish to believe that you can avoid any breaking changes, but I do think there’s a lot we can do to minimize breaking changes.
Before we really get into this, I want to qualify my comments in three ways.
First, I’m about to express a bunch of opinions. I’m not foolish enough to think that I’m right about everything, but I do have reasons for these opinions, and I’ve found that expressing an opinion is a great way to get the conversation started. What I really want is for the conversation to happen. Whether or not you agree with me is a less important point. (But of course, you should agree with me.)
Both the before and after of what we’re about to look at come from real APIs. I’m not making these examples up. Because I work at Microsoft, I already have a big target painted on my back, and because I have an opinion about REST APIs, I have a big target painted on my front, so please forgive me if I don’t cite exactly where the before example came from. I’m trying to derive a point, not point out which APIs did something I think is shortsighted.
Also, this is by no means a comprehensive or even prioritized list. These are examples that floated readily to top of mind so I suspect that they happen to be more common, but I’d love to get feedback on things I’ve missed or things you disagree with. So all of that said, let’s learn how to build futureproof APIs.
First things first: before you write a single line of code, you need to state your policies.
I actually stole this idea from Kin Lane, who is very passionate about the politics of APIs. And to be honest, he’s right. I feel a little foolish that I didn’t identify this as a problem sooner. Here are some of the policies he calls out as important.
05:00
Terms of Use is in my opinion the most important policy to have written down. This policy states the legally allowed and disallowed usage of a given API. Developers who fail to comply with this policy may face legal action, so it’s important to have this policy in place from the very beginning – otherwise it may be difficult to enforce the terms of use.
Privacy is another critical policy. The privacy policy is designed to protect everyone involved, and without an appropriate privacy policy in place it might be difficult to grow usage. Furthermore, like the terms of use, establishing a privacy policy after the fact leaves the data committed before its existence in a legal grey area.
Kin also says that it’s important to establish a deprecation policy. Preferably this policy would be established long before it’s necessary, and since APIs tend to go through dramatic changes from v1 to v2, your deprecation policy should ideally be in place from the start.
Kin doesn’t mention this one but I think it’s equally important to establish a breaking change policy. Establishing exactly what constitutes a breaking change will save you a lot of pain in the long run. For instance, is adding a new field to a resource a breaking change? Is adding a new resource type a breaking change?
But of course, this is just the tip of the iceberg. There are many other important policies to have: SLAs, rate limits, code licenses on SDKs and samples, support, etc.
We don’t have time to talk about all of them today, but if you’re interested I’d highly encourage you to grab Kin here at the conference or contact him through http://kinlane.com/contact.
Incidentally, I really wanted to do a “before” and “after” for each of the things I’m asserting but as I dug into this point I realized that pretty much everyone – Microsoft included – is doing a bad job here. Twitter gets honorable mention for making some policies available directly through their API.
So, let’s talk about what it should look like.
Your policies need to be readily accessible. It doesn’t do developers any good to have a link buried in the sign-up doc that leads to the terms of service, which has a link that leads to the privacy policy. Policies need to stare developers in the face so you should have all of your policies collected in one place so it’s easy to see what you have to offer.
Where they are gathered doesn’t matter quite so much, but they should be in a place where developers will “stumble” over them. Maybe that means on the front page of your documentation. Or if you use something like Swagger to provide machine-readable documentation, maybe we need to get all the right policy links in there. The point is that policies don’t work if they’re hard to find. Make them visible.
Even more difficult, policies don’t work if they change or if developers forget what was in the policies. Ideally you will find a way to occasionally remind developers about your API’s policies. In extreme cases where the policies change in a way that will negatively affect usage, you should consider versioning your API to call out the new policies.
Speaking of which, another thing to get right from the very beginning is versioning.
There are three primary ways to do versioning: in the URL path, in the headers, or in the query string. I’m not here to tell you what the right way to do versioning is
(although pragmatically speaking only one option makes sense).
I am here to tell you that you need to get versioning established from the very first version of your API.
This is true whether your API is public or private. In the simplest cases it’s enough to just stick a “v1” someplace.
If you’re dealing with a larger API, though, there are a few things you should consider which may not have occurred to you yet.
First of all, back to the deprecation policy we talked about. How long will you keep an API version available? Probably not forever. So how will you deprecate a particular version of an API? How do you communicate that to your customers?
Second, will you require callers to communicate the API version they want? This is nearly as religious an argument as how to do versioning. Requiring callers to communicate the API version they want is the easy way out. It guarantees that you have a way to avoid breaking changes. If you don’t require the version to be requested, you have to say that lack of version means either the newest version or the oldest version of the API. In either case a breaking change is inevitable.
One final consideration with respect to versioning: there’s a practice that Microsoft Azure has adopted that I really like. They have a deprecation policy that says we need to communicate to customers at least one year in advance of discontinuing a particular version of an API. But that’s obviously at odds with being able to try things in the API and get customer feedback. So they introduced a –preview suffix that means, “Hey, we want your feedback, but this particular version of the API doesn’t come with the same deprecation schedule as our normal APIs.” I personally think that’s a great idea and I’d like to see more APIs adopt a similar pattern.
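The Azure-style pattern described above can be sketched as a tiny version parser. This is purely illustrative: the `api-version` parameter name, the date format, and the exact suffix spelling are assumptions for the sketch, not something the talk prescribes.

```python
import re

# Hypothetical convention: a date-based version such as "2015-06-01",
# optionally carrying a "-preview" suffix that opts the caller out of
# the normal deprecation schedule.
VERSION_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})(-preview)?$")

def parse_api_version(raw):
    """Return (version, is_preview); reject unrecognized formats."""
    match = VERSION_PATTERN.match(raw)
    if match is None:
        raise ValueError("unrecognized api-version: %r" % raw)
    return match.group(1), match.group(2) is not None
```

Requiring the parameter from v1 means a missing version can simply be rejected, so "no version" never has to silently mean "newest" or "oldest."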
11:00
It’s important to carefully consider what belongs in headers. This will prevent fundamental problems in future versions of your API.
My rule of thumb is to only put things that are conceptually scoped to the request/response in the header. If what you’re trying to put on the response isn’t conceptually scoped to the response, put it in the body of the response rather than the headers.
My favorite example of this is pagination. Occasionally I see people like GitHub and Crowdflower publish APIs or guidelines that recommend including pagination tokens in the response headers. This seems to make sense for really basic REST APIs – such as who you’re following on some social media site.
But what happens when you want to bring back both who you’re following and their status updates?
If you want to bring these resources back in a combined fashion, you likely have a pagination token for the users as well as a pagination token for each user’s status updates. Representing that much information in a header would be unnatural. The other option would of course be to not allow the resources to be combined, but that would lead to some variant of the SELECT (N+1) problem where you’d see many more HTTP requests than necessary to satisfy the request.
But pagination is just one example. ETags are another good example. The ETag header works well for a resource that has a proper ETag (for example, a single person or a single status update) but it doesn’t work as well for a collection for a variety of reasons. When you are returning a single person or a single status update, feel free to use the ETag header.
When you’re returning a collection, you’ll have to think more deeply about the issue. Can you guarantee an ETag for the collection? And even if you can guarantee an ETag for the collection, do you have a way to communicate the ETags for each of the resources in the collection? It would be kind of sad to make a request for all your friends and then still have to request each friend individually in order to get the ETag.
So even when headers are an option sometimes, that doesn’t mean they are always the right option.
Again, headers are great for information that’s scoped to the request or response. If the information you have to communicate doesn’t conceptually apply to the request or response, I suggest you put it in the payload as that gives you more flexibility for future changes.
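The single-resource-versus-collection distinction can be sketched as follows. The `ETag` header is the standard HTTP mechanism; the `@etag` and `value` body keys are a made-up convention for this sketch, used only to show member ETags traveling in the payload.

```python
def single_response(person, etag):
    # A single resource has a proper ETag, conceptually scoped to this
    # response, so the standard ETag header works fine.
    headers = {"ETag": etag}
    return headers, person

def collection_response(people_with_etags):
    # For a collection, each member carries its own ETag in the payload,
    # so clients never have to re-fetch every member individually just
    # to learn its ETag.
    items = []
    for person, etag in people_with_etags:
        item = dict(person)
        item["@etag"] = etag  # hypothetical metadata key
        items.append(item)
    return {"value": items}
```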
14:00
You also need to document pagination from day one. Even if your actual responses aren’t paginated yet, your documentation needs to cover how pagination will happen.
There are two primary types of pagination in REST APIs.
First is client-driven pagination. This is usually done with some combination of query string parameters that have names like skip, take, top, limit, etc. So for example the client says, “please give me the second page of 10 status updates.” We shouldn’t be quite as worried about documenting client-driven pagination. Since this is entirely a client-initiated behavior, there’s no breaking change to worry about. The client has to make a change to see a change in the response.
The other type of pagination is far more worrisome. It’s hard to predict ahead of time how big a data set will grow. Except for a few very contrived scenarios, data will inevitably grow to a size that doesn’t make sense to return in a single payload. In those cases, the server must enforce pagination to prevent denial of service attacks. (You did plan for this, right? Your API will force pagination to prevent the status updates from saturating the network when your service goes viral?) In practically every scenario, clients need to be prepared for a continuation token to be present in some payloads and not others.
If your customers don’t know ahead of time what pagination will look like, they can’t possibly be prepared for it when you do introduce pagination. It’s a breaking change. At the other extreme, if the client knows that they should be prepared for pagination in every collection and they know what pagination looks like, then introducing server-side pagination on an existing collection is not a breaking change.
We started discussing pagination when we were talking about headers. I do want to reiterate that I firmly believe pagination should be done in the payload. We see REST APIs all the time that – whether they support hypermedia or not – have collections. Going back to our example of friends and their status updates, those are likely to be different resources but in reality there are many situations where it would be ideal to retrieve them together. Other examples where you might want to retrieve related resources are: customers and orders, saved maps and the points saved on them, categories and products. There are obviously many ways to model these sorts of relationships in a REST API. My point is not which is the right pattern, but rather that planning for pagination in the response body gives you more options.
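A client that is always prepared for server-driven paging might look like the sketch below. The `value` and `nextLink` envelope keys are assumptions borrowed from OData-style responses; the point is only that the continuation token lives in the body and may be absent.

```python
def fetch_all(fetch_page, first_url):
    """Drain a possibly-paginated collection.

    fetch_page(url) is any callable returning a dict with a 'value'
    list and an optional 'nextLink' continuation URL.
    """
    items, url = [], first_url
    while url is not None:
        page = fetch_page(url)
        items.extend(page["value"])
        url = page.get("nextLink")  # absent on the final page
    return items
```

Because the loop treats `nextLink` as optional, the server can start paginating an existing collection later without breaking this client.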
Most of the examples I’ve looked at have a way of correlating either a single link or a links object with a collection. Some styles (such as HAL or Siren) use a “links” object to contain multiple links.
Other styles (such as OData or JSON-LD) put the links inline and separate them using a metadata indicator, which we’ll talk more about in just a second.
18:00
A frustrating percentage of APIs intermingle data and metadata with no apparent distinction between the two.
For instance, consider this response. Apart from reading the documentation, there’s no clear way to distinguish what is data and what is metadata – information about the data.
Having to go read the docs to understand that count and next are metadata is a bad thing. Even if it didn’t require additional effort, the intermingling of data and metadata here limits extensibility. For instance, I could never introduce a property that has the same name as a metadata keyword. That might not be a big deal for these examples, but we’ve seen little issues like this cause problems time and time again.
Siren deals with this by reserving the property names of everything but the properties object. HAL prefixes reserved property names with an underscore.
JSON-LD and OData both embed “at” symbols in metadata property names.
I’ll abuse the pulpit here just for a minute to say that the syntax in OData is designed to support both property and parent targeting with metadata information. If you want to say that the parent object is a particular type (which is definitely metadata), there is no prefix to the metadata property name. If you want to say that a primitive property is a particular type, you can target the property by prefixing the “at” symbol with the property name. This enables OData to not have to introduce object structures just to supply metadata for primitive or collection properties.
Regardless, every major JSON format I’ve looked at goes to some lengths to distinguish between metadata and data. Oftentimes bespoke APIs don’t think about this and wind up running into problems down the road.
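The “at”-prefix convention can be sketched in a few lines. The splitting rule here is a simplification of what OData and JSON-LD actually do, but it shows why reserving a prefix matters: a future data property can never collide with a metadata keyword.

```python
def split_payload(payload):
    """Partition a response body into (data, metadata) by the '@' prefix."""
    data = {k: v for k, v in payload.items() if not k.startswith("@")}
    meta = {k: v for k, v in payload.items() if k.startswith("@")}
    return data, meta
```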
20:30
While we’re talking about the major JSON formats, let’s get contentious again.
If you really want to get a REST wonk frothing at the mouth, ask them whether the REST API should use a common mime type or whether a unique mime type should be created for each type of resource.
Given the context of our conversation, I’d hope that my position is fairly clear. It’s very difficult to have future-proof APIs if you’re inventing new mime types all the time. It might be possible to restrict the new mime types to new API calls and by doing so not introduce a breaking change, but I have often seen issues where authors resist creating new mime types until they have to, and then when the new mime type is finally created, existing API calls are adjusted to use the new mime type.
Let’s put that aside for the moment. If you invent a new mime type, a client stands no fair chance of being able to interpret that new resource type. It is absolutely possible – and in some cases already supported – to introduce new resource types using the major JSON formats.
Again, things like HAL and Siren simply don’t care about type. They simply state the properties and the relationships and leave everything else up to the client to figure out. So introducing new resource types here is incredibly trivial because every resource type looks the same.
OData and JSON-LD have a first-class way of representing type – and yes, that’s a good thing, since many clients need to be able to distinguish between a property that is a date/time property and one that looks like a date/time but is actually a string.
Much like HAL or Siren, both JSON-LD and OData can use this type annotation system to introduce new types without changing the media type of the payload.
So now we have the best of both worlds – unambiguous types and no need to mint media types.
23:00
We’ve mentioned some form of metadata a couple of times now. One thing that is easy to get wrong when you first build a REST API is an array.
It’s perfectly legal from a JSON perspective to have a root-level array.
One of the things we’ve learned on the OData journey is that it’s not a good idea to have a root-level array.
The rationale is fairly simple – where are you going to put that additional information?
When you introduce pagination, where will the pagination go? Yes, you could say that you’ll follow the GitHub or Heroku guidance to put pagination in the headers (although I hope you don’t) – but then what happens when you need to give clients a rough estimate of how many items are in the collection? Will you make a header for that too?
There are several other examples of information you might need to include about a collection. The best solution we’ve found is that when an array shows up at root level, you should wrap it in a JSON object. That gives you a place to put additional data like count, pagination links, self-links, type, etc.
In OData we do the same thing for the same reason with primitive types. This avoids breaking changes down the road.
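Wrapping can be sketched as a tiny envelope helper. The `value` wrapper mirrors OData’s convention; the `@`-prefixed metadata keys are illustrative, echoing the data/metadata separation discussed earlier.

```python
def envelope(value, **metadata):
    """Wrap a root-level array or primitive in an object.

    The wrapper leaves room for count, paging links, self-links, type,
    etc. to be added later without a breaking change.
    """
    body = {"value": value}
    for key, val in metadata.items():
        body["@" + key] = val  # metadata can never collide with data
    return body
```

A client written against `body["value"]` on day one keeps working when `@count` or `@nextLink` appear in a later version.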
25:00
Another consideration to try to get in place from the very beginning is a means of allowing responses to be self-describing.
This is a fundamental principle of REST and while HTTP APIs might not run into the need for self-describing responses immediately, the importance becomes self-evident as soon as you try to uncouple a request and response.
Self-describing means that a client does not require anything other than the response to interpret the response.
OData and JSON-LD achieve self-descriptiveness by embedding one or more “context” URLs in each response.
GitHub’s webhooks achieve self-descriptiveness by adding a header (and yes, it is conceptually scoped to the response) that says what event is firing.
Self-describing responses are obviously useful in scenarios like webhooks or similar push notifications since there is no request to correlate to. But self-descriptiveness is also surprisingly useful to asynchronous clients.
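The asynchronous-client case can be sketched as dispatch on the response alone. The `@context` key echoes JSON-LD and OData; the context URL and handler registry here are invented for illustration.

```python
# Registry mapping context URLs to handlers. With self-describing
# responses, the handler is chosen from the body alone -- no request
# needs to be correlated, which is exactly the webhook/push situation.
HANDLERS = {}

def handles(context_url):
    def register(fn):
        HANDLERS[context_url] = fn
        return fn
    return register

@handles("https://example.com/$metadata#people")  # hypothetical context URL
def handle_people(body):
    return ["person:" + p["name"] for p in body["value"]]

def dispatch(body):
    return HANDLERS[body["@context"]](body)
```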
27:00
The thing that annoys me most about today’s REST and HTTP APIs is the apparent randomness of the query patterns.
Twitter uses the count query string parameter to limit the number of search results. But you can use either “count” or “cursor” to limit the number of list members returned. And I’ll leave it as an exercise for you to figure out what semi-optional means here.
Facebook and GitHub, as far as I can tell, don’t allow client-side paging for things like comments.
Still other APIs use query string options like “page” or “take”.
The fact of the matter is that there’s very little consistency across REST APIs. You can’t take what you learned in some other REST API and apply it to a new REST API you’re trying to consume.
Even worse, you can’t take what you learned about accessing users and apply it to accessing status updates. Every time we continue to propagate bespoke REST APIs, we miss an opportunity to make life better for those consuming the API.
At a minimum, the query language used within a REST API should be consistent and preferably robust. We should not be inventing new query syntaxes (such as Twitter’s search for REST with a positive sentiment). We should exhaust all other options for query syntax before we invent something new.
Two major options for search syntax are Lucene and OData. Lucene has a syntax highly optimized for full-text search but since they include field qualifiers, it would be possible to phrase a fairly complex query using Lucene syntax. OData’s syntax is optimized more for related resources and as such allows you to search for things that you couldn’t do with Lucene. For instance, you could ask for all customers from the US that have any order totaling more than $5000 and specify that any order totaling more than $1000 should be included in the response. OData’s syntax also provides consumers a consistent way to do client-side paging, order results, et cetera.
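Sticking to one query convention everywhere can be as simple as the builder below. `$top`, `$skip`, and `$filter` are real OData system query options, but this helper is a toy for the sketch, not a client library, and percent-encoding is omitted to keep it readable.

```python
def query_url(base, top=None, skip=None, filter_expr=None):
    """Build a URL using one consistent, OData-style query syntax."""
    params = []
    if filter_expr is not None:
        params.append("$filter=" + filter_expr)
    if top is not None:
        params.append("$top=" + str(top))
    if skip is not None:
        params.append("$skip=" + str(skip))
    return base + ("?" + "&".join(params) if params else "")
```

Because every collection in the API accepts the same options, what a consumer learns on `/customers` transfers directly to `/orders`.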
Even if you don’t use Lucene or OData, please at least make every effort not to invent your own thing. We have enough novelty in this field already. You don’t get bonus points for originality. And most importantly as it regards this talk, you won’t have to come back two versions later and iron out all the rough edges. Because you started with something consistent, you won’t need to make breaking changes to improve the consistency of your query syntax.
30:00
Consistency doesn’t just apply to query syntax. There are myriad ways we can apply the consistency principle to the design of REST APIs.
Resource paths should be consistent. Response serialization should be consistent. A response should always be deterministic – the consumer shouldn’t have to guess whether they will get an object or an array. In other words, if you wrap top-level arrays and primitives, always wrap top-level arrays and primitives. Return consistent error codes.
I don’t have enough time to demonstrate this tool properly, but I want to show you something brand-spanking-new. This tool has never been demoed at any conference, internal or external. At first glance, the tool looks a lot like the Swagger Editor. That’s because we forked the Swagger Editor to prototype the tool.
We love Swagger in general but we really don’t like the amount of effort required to build Swagger docs. Even with the Swagger Editor YAML syntax, describing an API requires too much effort and results in too much inconsistency. One really amazing thing OData has to offer is consistency. An OData API is extremely predictable.
So we took the conventions from OData, added an extremely terse YAML format describing the resource model – not the API – and voila, we have the lamely-named model-first tool.
In this tool we can describe an API by just describing the root and types associated with the API. Everything else is generated for us. This tool has two benefits that I think are most important.
First, it doesn’t require advanced knowledge of HTTP APIs or REST. You don’t need to know which HTTP method should be used for create versus which should be used for update. You don’t need to understand how to construct paths to resources. All you need to know is how to construct a resource model (which most OO programmers already know) and a very small additional concept, the service root.
Second and more importantly, using this tool gives you almost all of the consistency of OData. It’s not perfect fidelity but it’s a lot better than you’re likely to get slinging Swagger docs by hand.
If these principles were useful for you in general, I want to share where they come from. I’ve worked on the OData team at Microsoft for the past three and a half years. In that time I’ve seen and consulted on hundreds of REST APIs from tiny little internal APIs to massive REST APIs serving millions of calls per minute. You learn a lot about REST APIs when that’s all you work on, day in and day out.
One thing I’ve learned is that OData is perhaps the worst brand name ever. It made sense at the time that we coined it – we were scaffolding REST APIs on top of relational databases.
But I do want to tell you that OData is good for so much more than that. I truly believe any HTTP API – whether it’s resource-oriented or whether it’s RPC-oriented – can learn something from OData.
OData is now an official public standard under the OASIS standardization body and is going through ISO standardization right now.
It’s been adopted by the mega-giant software companies like Microsoft, SAP, and IBM, it’s increasingly being adopted by SaaS providers like Salesforce, Sage, and SuccessFactors, and data integrators like Jitterbit, Informatica, and Acumatica all use OData as a primary tool in their toolbelt.
OData sometimes gets a bad rap, but I sincerely believe it’s undeserved and I’m happy to provide evidence to back that up. If you’re interested in discussing more or looking at how OData can help your company, please feel free to reach out to me. I’ll be at the conference for the rest of today and tomorrow, and this QR code has my contact info. I encourage you to use it.
I sincerely thank you for your time and attention, and I hope you enjoy the REST of the conference. (Har, har.)