@aviflax
Created April 23, 2012 23:32
Avi’s Notes on the Reuters Next REST API v0.6

Notes on the Reuters Next REST API v0.6

by Avi Flax • 23 April 2012

API “Entry Point” URL

I recommend against including the segments “rest” and the version number in the “entry point” URL. The likelihood of needing to add another style of API in the future is very low.

As for the version number, there are many downsides to including the version number in the entry point URL. Instead, I recommend that clients indicate specifics about the representation they desire by utilizing media type parameters.

For example, a client wishing to retrieve a representation of an item resource could send the request header Accept with the value application/json; version=1. This has many advantages; see the above link entitled “many downsides”. (It’s also important to require that all requests include an Accept header, and that it be as specific as possible. This makes it easy to add new representations in the future without affecting existing client-server interactions.)
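As a rough illustration of how a server might act on such a header, here is a minimal Python sketch. The `SUPPORTED` table and the serializer names in it are hypothetical, and the sketch ignores `q` weights, which a real implementation of content negotiation would need to honor.

```python
def parse_accept(header):
    """Split an Accept header into (media_type, params) pairs, in header order."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        params = {}
        for piece in pieces[1:]:
            name, sep, value = piece.partition("=")
            if sep:
                params[name.strip().lower()] = value.strip().strip('"')
        entries.append((media_type, params))
    return entries


# Hypothetical registry: (media type, version) -> serializer name.
SUPPORTED = {
    ("application/json", "1"): "item_json_v1",
    ("application/json", "2"): "item_json_v2",
}


def choose_representation(accept_header):
    """Return the serializer for the first supported entry, or None.

    None means the server has nothing acceptable to offer (a 406 response),
    which is also what happens when the header is missing or unversioned.
    """
    for media_type, params in parse_accept(accept_header or ""):
        serializer = SUPPORTED.get((media_type, params.get("version")))
        if serializer is not None:
            return serializer
    return None
```

Because the version travels with the media type, a later `version=2` representation can be added to the table without touching any URL or any existing client.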

HTTP Caching

Joe Gregorio, an expert in HTTP and REST, recommends using ETags and If-None-Match over Last-Modified and If-Modified-Since:

“Many of the things I'm talking about with ETags and If-* headers can also be done with a last modified time served in the Last-Modified: header. In general I advise against using Last-Modified: since it is limited to a one second granularity and you may have issues with clock skew among a group of servers. ETags are just conceptually simpler and just as powerful. This advice is only really for servers, which can decide which cache-validators to support, clients have no such luck and should support both.”

ETags are also incredibly useful for preventing users overwriting each other’s changes (known as the “lost update problem”). See the subsection below entitled “Concurrency Control”.
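To make the caching half of this concrete, here is a minimal Python sketch of a conditional GET. Deriving the ETag from a hash of the representation is only one possible choice and is an assumption of this sketch; the function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag (a quoted string) from the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on a resource.

    body is the current representation; if_none_match is the raw value of
    the request's If-None-Match header, or None when the header is absent.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # the client's cached copy is still good
    return 200, headers, body
```

A client stores the ETag from the first response and sends it back in If-None-Match; as long as the representation is unchanged, the server answers 304 with no body.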

Resources

Grouping

I find the way the resources are grouped very confusing. Each resource should stand on its own. For example, instead of a section entitled “Items & Sources” there should be sections entitled “A collection of Sources”, “A Source”, “An Item”, etc.

Routes

I find the “routes” subsection of each resource highly problematic. This is really an RPC style masquerading as REST. The representations of a resource do not change with the URL used or with the method used, so this structure duplicates much information.

Instead, each resource-section should primarily focus on including and describing one or more example representations.

URLs

Ideally the API would fully embrace the REST constraint “Hypertext As The Engine of Application State” (HATEOAS) by providing to developers (and therefore clients) only a “root” URL for the application, thus requiring client applications to traverse the API in order to work with it. The server would then be free to change its URL scheme at any time in response to changing circumstances; clients would naturally adapt to such changes because they would be dynamically retrieving and responding to the state of the application afresh with each session. For more, see “REST APIs must be hypertext-driven” by Roy Fielding.
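A minimal Python sketch of such a client follows. Everything about the representations here is an assumption made for illustration: the in-memory `FAKE_SERVER`, the `links` member, and the relation names are hypothetical and do not come from the API spec.

```python
# Hypothetical in-memory "server": URL -> parsed representation.
# The URLs are deliberately opaque; the client never constructs one.
FAKE_SERVER = {
    "/": {"links": {"sources": "/s"}},
    "/s": {"count": 1, "links": {"first": "/s/reuters"}},
    "/s/reuters": {"name": "Reuters", "links": {"items": "/s/reuters/i"}},
}


def fetch(url):
    """Stand-in for an HTTP GET that returns a parsed representation."""
    return FAKE_SERVER[url]


def follow(root_url, relations, fetch=fetch):
    """Start at the root and follow one link relation per step."""
    document = fetch(root_url)
    for rel in relations:
        try:
            url = document["links"][rel]
        except KeyError:
            raise LookupError(f"no link with relation {rel!r}") from None
        document = fetch(url)
    return document
```

The only things hard-coded in the client are the root URL and the relation names, for example `follow("/", ["sources", "first"])`. The server could rename `/s` to anything else without breaking it.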

If, however, it’s decided that for whatever reasons, this API will not conform to HATEOAS, then at least each resource should list only a single URL pattern, and the URL patterns should use URI Templates, a formal approach standardized in RFC 6570, rather than the ad-hoc colon-prefixed syntax based on regular expressions. I.e. instead of /sources/:source-slug use /sources/{source_slug} (RFC 6570 variable names may contain letters, digits, underscores and dots, but not hyphens).
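For a sense of what clients gain from a formal template syntax, here is a minimal Python sketch of Level 1 expansion only (simple `{var}` substitution); operators such as `{?q}` or `{/path}` from the higher levels of RFC 6570 are deliberately not handled. The variable is spelled `source_slug` because RFC 6570 variable names do not allow hyphens. The function name `expand` is hypothetical.

```python
import re
from urllib.parse import quote

# Level 1 only: {name}, where name uses letters, digits and underscores.
_VARIABLE = re.compile(r"\{([A-Za-z0-9_]+)\}")


def expand(template: str, **values) -> str:
    """Expand simple {var} expressions in a URI Template.

    Values are percent-encoded except for unreserved characters
    (letters, digits, "-", ".", "_", "~").
    """
    def substitute(match):
        value = values.get(match.group(1))
        # An undefined variable expands to the empty string.
        return "" if value is None else quote(str(value), safe="")

    return _VARIABLE.sub(substitute, template)
```

For example, `expand("/sources/{source_slug}", source_slug="top news")` yields `/sources/top%20news`.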

Methods

There’s no need to explain what each HTTP method does with each resource, because that’s defined by the HTTP spec. (This is called taking advantage of a uniform interface.) It would be mostly harmless to include a list of supported HTTP methods for each resource, although it’d be preferable to recommend that developers and/or user-agents use OPTIONS to determine that information either during development or at runtime.
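As a sketch of what runtime discovery could look like, the Python below sends an OPTIONS request and reads the Allow response header. It assumes the server actually answers OPTIONS with an Allow header, which the spec under review does not promise; the host and path passed to `allowed_methods` would be whatever the API exposes.

```python
import http.client


def parse_allow(header_value):
    """Turn an Allow header value such as 'GET, HEAD, PUT' into a set of methods."""
    if not header_value:
        return set()
    return {m.strip() for m in header_value.split(",") if m.strip()}


def allowed_methods(host, path):
    """Send OPTIONS to https://host/path and return the advertised methods."""
    connection = http.client.HTTPSConnection(host, timeout=10)
    try:
        connection.request("OPTIONS", path)
        response = connection.getresponse()
        response.read()  # drain any body so the connection closes cleanly
        return parse_allow(response.getheader("Allow"))
    finally:
        connection.close()
```

A client could then check `"DELETE" in allowed_methods(host, path)` before offering a delete action, instead of relying on a table in the documentation.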

DELETE

Why do no resources support the DELETE method? I’d think this’d be important!

Parameters

Some of the routes include a list of “available” parameters. There’s no indication of how these parameters should be sent — if they’re meant to be query parameters, that should be clarified, because they could just as feasibly be sent as request headers.

X- Headers

The X- prefix has been deprecated. So instead of naming a header X-Focused-On, name it just Focused-On.

The Item Resource

The representation of this resource appears to include an HTML representation of the item’s content, serialized into JSON. It’s important that this serialization be documented in much greater detail. How does it work? For example: what elements can be used? Do the elements support nesting? What character encodings are supported?

The Collection Resource

Why does this resource support only PUT? I’d think adding an item to a collection would be a very common operation, and using POST would be much more efficient. It would also make it more feasible for multiple users to collaborate on adding items to a collection.
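The collaboration problem can be shown with a small in-memory Python sketch. The `Collection` class is a hypothetical stand-in for the server-side resource, not anything defined by the API.

```python
class Collection:
    """In-memory stand-in for a collection resource holding item identifiers."""

    def __init__(self, items=()):
        self.items = list(items)

    def put(self, full_item_list):
        """PUT semantics: replace the whole state with what the client sent."""
        self.items = list(full_item_list)

    def post(self, item):
        """POST semantics: append one item; the client need not know the rest."""
        self.items.append(item)


# Two clients read the same starting state, then each adds one item via PUT.
via_put = Collection(["a"])
snapshot_1 = list(via_put.items)
snapshot_2 = list(via_put.items)
via_put.put(snapshot_1 + ["b"])
via_put.put(snapshot_2 + ["c"])  # silently discards "b"

# The same two additions via POST do not interfere with each other.
via_post = Collection(["a"])
via_post.post("b")
via_post.post("c")
```

With PUT, each client must also download and resend the entire collection to add a single item, which is where the efficiency cost comes from.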

Concurrency Control

Any system which will have multiple users working on a shared set of resources needs to account for concurrency control. In a nutshell, this is deciding how to handle the case of two users opening the same exact resource, editing it, and sending updates to the server at about the same time. A naive approach will result in one user’s changes being lost, AKA “the lost update problem.”

Since I didn’t see anything in the spec about this, I recommend using ETags to implement MVCC-style optimistic concurrency control. This entails servers always sending ETags when sending representations, and clients sending the request header If-Match with all PUT requests. The server will check the value of the header and will only process the update if it matches the ETag of the current state of the resource. If it doesn’t, the server will reject the request with a 412 (Precondition Failed) response, the status HTTP defines for a failed If-Match, notifying the client that the changes it tried to send were based on an out-of-date version of the resource, and that if the user wants to continue, they’ll need to “rebase” their changes to the current state of the resource.

I recommend that the server require If-Match to be included with all PUT requests, and that the value * be disallowed.
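A minimal Python sketch of that server-side check follows. Several details are assumptions of the sketch rather than anything from the spec: the ETag is a hash of the stored bytes, a missing If-Match is answered with 428 (Precondition Required, RFC 6585), a `*` value is answered with 400, and a stale ETag is answered with 412 (Precondition Failed), the status HTTP defines for a failed If-Match. The names `handle_put` and `store` are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag (a quoted string) from the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_put(store: dict, url: str, if_match, new_body: bytes) -> int:
    """Apply a PUT to an existing resource only if If-Match names its current ETag.

    store maps URLs to the current representation bytes; if_match is the raw
    If-Match header value, or None when the header is absent.
    Returns the HTTP status code the server would send.
    """
    if url not in store:
        return 404
    if if_match is None:
        return 428  # assumption: require the header on every PUT
    candidates = [c.strip() for c in if_match.split(",")]
    if "*" in candidates:
        return 400  # assumption: the wildcard is disallowed outright
    if make_etag(store[url]) not in candidates:
        return 412  # the client edited an out-of-date version
    store[url] = new_body
    return 204
```

If two clients both hold the ETag of version 1, the first PUT succeeds and changes the ETag, so the second receives 412 and must fetch the new state before retrying.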
