# Why would I build my API with REST? (the real one, with hypermedia)


## Abstract

REpresentational State Transfer (REST) remains an innovative network architecture, and it
provides solutions to frequent problems in API design and operation. It decouples client and
server, making it possible to experiment quickly, to modify business logic and routing on the
fly, and to make API operations discoverable by humans. In discussions around REST, it is
often apparent that people don't realize how it provides those solutions, so this article
illustrates them with relevant use cases.


## How I discovered REST

More than 10 years ago, I read [Roy Fielding's PhD thesis](https://ics.uci.edu/~fielding/pubs/dissertation/top.htm) about the design of HTTP, in which he
described a new network architecture, [REpresentational State Transfer (REST)](https://ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm). That discovery
followed a now familiar path:

1.  I read about scientific work on a subject X in Computer Science; I'm awed and grateful for the
    level of thoroughness the scientists went to, describing a very precise engineering trade-off
    with clear costs and clear benefits.
2.  I'm eager to apply this new knowledge, so I go look at how others did it before me, especially if
    some tooling exists that I could use.
3.  I discover that very few people actually use X; worse, most people don't even know that X
    exists, or when they do, they believe all sorts of unfounded, false things about it.

Indeed, REST is pretty awesome. Fielding identified crucial issues that a network architecture
can have, and REST is designed to avoid several of them, including an important one: coupling
between clients and server. When you are building an internal API with very few consumers
that you have some control over, coupling between client and server might not seem like a real
problem (I think it still is, though, and we underestimate the costs of excessive coupling), but
if you are building a public API with lots of consumers you have no control over, it quickly
becomes an obvious and serious problem.

When there is C–S coupling, most modifications to the API end up having one of two effects:

-   you break existing clients (and piss off the people building or using them),
-   you need to maintain several versions of your API in parallel and wait for clients to migrate
    to the new one before you can stop maintaining the old one.

If this were "just" a nuisance that can slow you down, that you can get used to, that you can
build processes around and that tooling can help make painless, that would not be such a serious
problem. But C–S coupling also creates unsolvable issues:

-   you cannot experiment at scale; you can only roll out a new version and pray you won't need to
    tell clients to go back to the previous one
-   you cannot change the logic that decides when operations should be attempted, because that logic
    must be implemented by the client
-   you cannot do sharding, because clients are in control of where they route their requests


## What it looks like with REST


### How logic can change without changing the client

A frequent argument against REST is that it's made for humans, not for machines. There's a
kernel of truth behind that argument: HTML indeed makes it easy for humans to discover how to
use an HTTP server (if the pages are in a language they understand), without knowing anything in
advance but a single starting URL (even less in the presence of search engines). But the
argument makes two errors.


#### UI-responses

The first error in the argument is that, even when communication occurs between two machines,
there may still be a human at some point in the overall workflow. A server could
communicate with a REST API and make some decisions autonomously, but still use the data
received from the REST API to present choices to a human. That's an extremely frequent
scenario, and it benefits from REST in much the same way that browsing the Web
benefits from REST.

Let's call an API response that provides UI options a UI-response.

Now, there are several levels to using UI-responses. The easiest level is when the UI-response
changes which actions are available, among actions that are already known to the UI. It's the
easiest because the designer can choose very carefully where each action will go in the UI and
how it will be displayed (what text to use, how to localize it, what icons to use, how to show
availability, etc…).

A variant of this first level is when the API presents a dynamic collection of options. REST
makes it easier to add new kinds of options, as long as they can be displayed in the same way. For
example, when the user chooses the destination for a message, the API might initially have been
designed with only users as destinations, while groups and bots later become new kinds of
destinations. Without REST, destinations might have been plain string identifiers, which forces
all destinations into a single namespace; or, if clients knew to send each message using the URL
template `/users/:user/mailbox`, the server implementation might get clunky, with the handler for
users' mailboxes turning into the router for all kinds of message destinations.
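
To make this concrete, here is a hedged sketch of a UI-response listing destinations and of a
client that renders whatever the server advertises; the response shape, field names and URLs are
invented for this example, not taken from any standard:

```typescript
// Hypothetical shape of a UI-response listing message destinations.
interface DestinationLink {
  href: string; // where to send the message
  name: string; // display label chosen by the server
  kind: string; // "user", "group", "bot", or anything added later
}

interface DestinationsResponse {
  destinations: DestinationLink[];
}

// The client renders whatever the server lists and never builds URLs itself,
// so adding groups or bots on the server requires no client change.
function renderDestinationMenu(response: DestinationsResponse): string[] {
  return response.destinations.map((d) => `${d.name} (${d.kind}) -> ${d.href}`);
}

// Example usage with a response the server might send.
const example: DestinationsResponse = {
  destinations: [
    { href: "/users/alice/mailbox", name: "Alice", kind: "user" },
    { href: "/groups/ops/mailbox", name: "Ops team", kind: "group" },
    { href: "/bots/reminder/mailbox", name: "Reminder bot", kind: "bot" },
  ],
};

console.log(renderDestinationMenu(example).join("\n"));
```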

Another example I faced was a UI that presents the possible transitions of a state machine. With
REST, every time the state machine changes, the client can stay the same. An even more
impressive case is when it becomes important to let users jump several transitions at once,
whenever the conditions for each transition are met. With REST, it is just a matter of adding more
entries to the UI-response, possibly with labels detailing the path taken. Without REST, it
would mean a major refactoring of the client, the API and the server before the change could
reach users.
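
A minimal sketch of the server side of that idea, under invented names: the UI-response is
generated from the state machine itself, so letting users jump several transitions at once is
just a matter of emitting more entries:

```typescript
// Hypothetical state machine: states, transitions and a guard per transition.
type State = "draft" | "review" | "approved" | "published";

interface Doc {
  complete: boolean;
  signedOff: boolean;
}

interface Transition {
  from: State;
  to: State;
  guard: (doc: Doc) => boolean;
}

const transitions: Transition[] = [
  { from: "draft", to: "review", guard: (d) => d.complete },
  { from: "review", to: "approved", guard: (d) => d.signedOff },
  { from: "approved", to: "published", guard: () => true },
];

// Build the UI-response entries: the next single transition, plus longer
// jumps whose intermediate guards all pass. The client only renders this list.
function availableActions(current: State, doc: Doc): { label: string; href: string }[] {
  const actions: { label: string; href: string }[] = [];
  const path: State[] = [];
  let state = current;
  for (;;) {
    const next = transitions.find((t) => t.from === state && t.guard(doc));
    if (!next) break;
    path.push(next.to);
    actions.push({
      label:
        path.length === 1
          ? `Move to ${next.to}`
          : `Move to ${next.to} (via ${path.slice(0, -1).join(", ")})`,
      href: `/documents/42/transitions/${next.to}`,
    });
    state = next.to;
  }
  return actions;
}

console.log(availableActions("draft", { complete: true, signedOff: true }));
```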

The next level is when the UI-response provides actions that are unknown to the UI. In many
UIs, it would not be a good idea to just add actions that were not designed to be there: it
might get confusing for a French or Chinese user if an action called "unlink-from-account"
suddenly appears in a menu… But on the other hand, in many UIs targeting narrow categories of users
(like trained employees or power users), such a menu item might be perfectly fine, or a default
mechanism to display unknown actions might be fine, like transforming an action called
"unlink-from-account" in the API into "Unlink from account\*" in the menu.

If the unknown actions are added to a resource that already has menus about it, it might be
quite easy to just add the unknown actions to one of those menus. If the unknown actions are
added to a resource that doesn't have such a menu, it might not be realistic for the UI to
accommodate them without disrupting the design. Other kinds of UIs might make it possible to add
unknown actions, like a command line. With unknown actions, there might also be security and
safety concerns, especially if the API server is outside your organization.

The important aspect is that the discoverability afforded by REST to humans can still be present
even if part of the communication is between machines.


#### Machines between themselves

The second error in the argument concerns the part of the communication that's purely
between machines: although there's indeed no real discoverability of unknown actions there (except
if we use an AI as a REST client…), there's still value in moving as much of the logic as
possible from the client to the server.

Even when there's no UI-response (when there is one, the pace is usually dictated by human
actions in the UI), a client and a server will usually each have internal workflows that move at
different paces.

For example, let's say ClientA and ServerB communicate autonomously. ClientA asks
ServerB to create a `Foo` resource and to make it active. When the `Foo` is active,
ClientA uploads the `stage1` file to the `Foo`. Once `stage1` has been validated, ClientA
uploads the `stage2` file to the `Foo`. One pace difference between ClientA and ServerB may
be that ClientA is able to produce both the `stage1` and `stage2` files before they can be
uploaded to ServerB.

In a non-REST API, it's ClientA's responsibility to know the logic of each stage, build the
URLs for each action and decide when to send the requests. In this context, if the owners of
ServerB discover that it would be more convenient for the users of their API to be able to
upload the files as soon as they're available, they need to inform the users through
documentation, and users will only benefit once they have modified the logic inside their
clients. If this new logic then turns out to be a problem for ServerB, they again need to inform
users through documentation, and either they are stuck with clients using the problematic logic
until those get fixed, or they break the API for the updated clients, which gives clients an
incentive not to update too quickly the next time. If the problem is not too severe, it will
probably be a bit of both: suffer the new logic until enough clients have been reverted and
break the API for the remaining stragglers.

In a REST API, it's ServerB's responsibility to make the available actions show up in resource
representations when it makes sense. In this context, ClientA would be designed to upload the
various files as soon as the corresponding actions show up. If the owners of ServerB decide to
make them show up sooner in some version of their server, the users will benefit from that
updated logic as soon as it's deployed on that server. And if the owners of ServerB discover an
issue with the updated logic, they can revert the change and every single client will
immediately follow the reverted logic.
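
A hedged sketch of what ClientA could look like under that design: it never decides when an
upload is allowed, it simply reacts to whatever actions ServerB includes in the representation of
the `Foo`. The representation shape, action names and URLs below are assumptions made for the
example, not part of any real API:

```typescript
// Hypothetical representation of a Foo resource: a status plus whatever
// actions ServerB currently allows, each with a name and a target URL.
interface FooRepresentation {
  status: string;
  actions: { name: string; href: string }[];
}

// Files ClientA has produced locally and not yet uploaded.
const pending = new Map<string, Uint8Array>([
  ["upload-stage1", new TextEncoder().encode("stage1 contents")],
  ["upload-stage2", new TextEncoder().encode("stage2 contents")],
]);

async function followFoo(fooUrl: string): Promise<void> {
  while (pending.size > 0) {
    const res = await fetch(fooUrl);
    const foo = (await res.json()) as FooRepresentation;

    // Upload every file whose corresponding action is currently advertised.
    // Whether stage2 may be sent before stage1 is validated is entirely
    // ServerB's decision; ClientA just follows the links it is given.
    for (const action of foo.actions) {
      const body = pending.get(action.name);
      if (body) {
        await fetch(action.href, { method: "PUT", body });
        pending.delete(action.name);
      }
    }

    // Wait a bit before looking at the representation again.
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

followFoo("https://serverb.example/foos/123").catch(console.error);
```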

In this example, we can consider some of the many possibilities that REST makes easy and C–S
coupling makes impossible:

-   ServerB can experiment with the new logic, including with beta testing, canary testing and
    A/B testing
-   experiments can be shut down as soon as any results are found, positive or negative
-   after each experiment, new experiments can be designed and deployed immediately
-   this process of successive experiments can include gradually expanding how many users see the
    new logic until every user is switched to the new logic (progressive rollouts)

For example, if the ability for every user of the API to upload files preemptively is found
detrimental to the system, any new logic could be tried in quick and concurrent experiments:

-   only some category of trusted users can upload immediately
-   immediate uploads obey local or global quotas
-   immediate upload is a privilege that is granted to all users initially but revoked in case of
    abuse


#### The need for self-description

One caveat to the almost magical ability of REST to introduce unknown actions in UI-responses
is the nature of the payload.

In HTML, this problem is solved in forms by the fact that a user is supposed, either from
outside documentation, context or information around the inputs, to know what to put in each
form field, and each field name is provided by the page. If a new form appears in a web page
asking whether the user wants to be awarded a free meal, with "yes" and "no" buttons, the nature
of the payload is provided by the server and the human user is able to decide how they want to
use this new action.

For UI-responses, this means that unknown actions need to follow one of the following patterns
in order for the client to know what payload to send and what HTTP method(s) to use:

1.  the API documentation provides a default request for unknown actions (e.g. `POST` with a
    special payload, that could simply be an empty payload)
    -   this means that a client that knows the full documentation of this action may use other
        kinds of requests than this default request
    -   this also means that when the action is unknown, it's basically used in a restricted or
        degraded mode where only a narrower set of capabilities is available
2.  the API includes four features similar to HTML forms:
    -   a URI for the action
    -   an HTTP method to use
    -   a description of payload content to be filled by humans
    -   a description of payload pre-filled by the server
3.  the API includes a more complex vocabulary to designate what's expected in the payload, with
    respect to human input, objects sent to the client and references to resources on the server

It is already possible to do #2 at least with [XForms](https://www.w3.org/TR/xforms/), [Siren](https://github.com/kevinswiber/siren) and [Collection+JSON](http://amundsen.com/media-types/collection/format/). It is unclear
from the [Hydra specification](https://www.hydra-cg.com/spec/latest/core/) whether it supports providing pre-filled content. Other formats make it
possible to describe links between resources and resource collections, even when they are not as
expressive as HTML forms (e.g. [JSON:API](https://jsonapi.org/format/), [HAL](https://stateless.group/hal_specification.html)).
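
As a concrete illustration of pattern #2, here is a hedged sketch of an action loosely following
the shape Siren uses in its own examples; the resource, field names and values are invented for
this article:

```typescript
// A Siren-style action: target URI, HTTP method, and fields that mix
// server pre-filled values ("hidden") with values to be supplied by a human.
const delegateAction = {
  name: "delegate-to-email",
  title: "Delegate authority to an email address",
  method: "POST",
  href: "https://api.example.com/resources/42/delegations",
  type: "application/x-www-form-urlencoded",
  fields: [
    // Pre-filled by the server; the client just sends it back.
    { name: "resourceId", type: "hidden", value: "42" },
    // To be filled in by the human using the UI.
    { name: "delegateEmail", type: "email" },
  ],
};

console.log(JSON.stringify(delegateAction, null, 2));
```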

HTML itself has actually been used as a format for REST APIs<sup><a id="fnr.1" class="footref" href="#fn.1" role="doc-backlink">1</a></sup>. At its core, it already features the ability to encode
links and their nature (with the `href` and `rel` attributes). Forms can obviously encode
pattern #2. In HTML5, the [microdata](https://html.spec.whatwg.org/multipage/microdata.html) feature adds the ability to encode data in a
machine-readable way.

There may not be many use cases where #2 isn't enough and #3 becomes necessary. One such use
case might be request pipelining, as is done in the E language. With HTTP/2 and an API that
follows pattern #3, you could send a request R1 and immediately send a request R2 with data
that references the (yet unknown) response to R1, and the server could answer R1 and then R2
in one sweep, without additional network round trips…

There is a trade-off between self-description and network efficiency, but there are ways to
have both self-description and lightweight responses. If the metadata about the API is a separate
resource, it can be fetched only once and cached for future requests: you pay one added round
trip initially and you gain lighter responses afterwards.

This also means that, for common uses, we could use shared definitions and make it even easier
for clients to adapt to new actions appearing in the API. For example, an application designer
could include a widget for an operation like "Delegate authority on this resource to someone
with an email address", that's identified by a URI like
`http://example.com/actions/delegate2email`. Several independent API services could, after the
release of the application in production, add such an action to some UI-responses, and the
application would be able to make those actions visible to the user in a way that's consistent
with the established user experience of the application.

Note that the latter doesn't even need self-description: the resource doesn't need to be
metadata about the operation that can be fetched and processed. The URI could just be an
identifier for something that's documented for human developers.
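
To make the idea concrete, here is a hedged sketch of such a client-side registry; the
`delegate2email` identifier comes from the example above, while the response shape, field names
and widget rendering are assumptions made for illustration:

```typescript
// Hypothetical shape of an action advertised in a UI-response: an identifier
// URI naming the operation, plus the target to which the request is sent.
interface AdvertisedAction {
  id: string; // e.g. "http://example.com/actions/delegate2email"
  href: string; // where the client should send the request
}

// Widgets shipped with the application, keyed by the operation identifier.
// Any API that later advertises one of these identifiers gets the widget
// "for free", with the application's established look and feel.
const widgetRegistry = new Map<string, (href: string) => string>([
  [
    "http://example.com/actions/delegate2email",
    (href) => `render email-delegation dialog submitting to ${href}`,
  ],
]);

function widgetsFor(actions: AdvertisedAction[]): string[] {
  return actions
    .map((a) => widgetRegistry.get(a.id)?.(a.href))
    .filter((w): w is string => w !== undefined);
}

console.log(
  widgetsFor([
    { id: "http://example.com/actions/delegate2email", href: "/docs/7/delegations" },
    { id: "http://example.com/actions/unknown", href: "/docs/7/other" },
  ]),
);
```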

Of course, pattern #3 and self-description have security implications that should be
investigated. If they don't follow secure design principles like the Principle of Least
Authority, they could easily open the door to [confused deputy attacks](https://en.wikipedia.org/wiki/Confused_deputy_problem). Even pattern #2 opens
the possibility of phishing in some contexts.


### Sharding

API sharding consists of permanently handling the resources of your API on different servers,
according to some partitioning scheme. Among the most obvious schemes are:

-   dividing users among servers for load balancing
-   assigning all the users of a single customer to a dedicated server, for SLA or security purposes
-   handling some type of resource on a dedicated server, e.g. file upload or payment

The problem without REST is obvious: if your clients decide where they send their requests, it's
impossible to start sharding (or to change the sharding scheme if there is one, applied by the
clients). Without REST, when the need for sharding arises, the only real solution is to keep the
known servers as front-end routers that dispatch requests to the actual servers. That way, you
can benefit from sharding in terms of computing power and storage, but not in network usage,
security or availability.

As with business logic, with REST, sharding can be tested, implemented and modified instantly.
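
As a rough illustration of what this buys you, here is a hedged sketch of a client that follows
an advertised link rather than building URLs; the entry-point shape, the `uploads` relation and
the hostnames are all invented for the example:

```typescript
// Hypothetical entry-point representation: the server decides which host
// serves each collection, and can change its mind at any time.
interface EntryPoint {
  links: { rel: string; href: string }[];
}

async function uploadFile(apiRoot: string, data: Uint8Array): Promise<void> {
  const entry = (await (await fetch(apiRoot)).json()) as EntryPoint;

  // Follow the advertised link instead of building a URL: today it might be
  // https://uploads-eu.example.com/files, tomorrow another shard entirely.
  const uploads = entry.links.find((l) => l.rel === "uploads");
  if (!uploads) throw new Error("API no longer advertises an uploads link");

  await fetch(uploads.href, { method: "POST", body: data });
}

uploadFile("https://api.example.com/", new TextEncoder().encode("hello")).catch(console.error);
```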


## Couldn't I do most of this without full-blown REST?

Actually, most of the issues described here could very well find solutions that don't involve
implementing a full REST API. Known actions could each have an indicator for their availability,
for example. Even some sharding could be implemented by having clients discover which server to
send a request to (e.g. from redirections) and then still build the URLs themselves.

Three elements make me suspect that this would not be a good or realistic idea:

1.  If you are designing a non-REST API, including features like action availability and
    redirections goes against the industry standard, and I expect most teams to avoid them until
    they become necessary, whereas a REST design makes those features a core requirement from the
    get-go; without REST, designers will probably end up implementing those solutions as a major
    breaking change in the API, which partly defeats their purpose.
2.  Once you implement those solutions, you don't end up with a design that's noticeably simpler
    than REST. So you might as well use REST (which is more complex to implement than an API with
    C–S coupling, given the current tooling).
3.  All those non-REST solutions feel like hacks and they certainly don't have behind them the
    level of investigation that was put into REST. It feels silly to have such a careful design on
    hand and to use instead a solution with uncertain design consequences.


## The issues with REST

Despite all these benefits, REST is very rarely used today. People call "REST" or "RESTful" what
are actually RPC APIs as soon as they conform to the HTTP semantics, with idempotent `GET`
requests, with `PUT` used to store a state, `DELETE` to remove or disable a resource, and `POST`
for all other operations with side effects. Doing this has become an industry standard, which
makes it easy in terms of knowledge and tooling, whereas implementing actual REST is seen as a
leap into the unknown. When people don't see all the benefits, this may not seem like a
reasonable risk to take.

This article intends to address one part of the problem: if people see all the benefits, they may
decide that they are worth the trouble. What remains is still a lack of tooling and knowledge.


### Tooling

When someone wants to design an HTTP RPC API, they have several tools available to help design
the API, test it, and help client developers with auto-generated code, as well as auto-generated
documentation. People in the OpenAPI Specification (formerly Swagger) community have asked
several times for REST support to be added but, up until now, none of the popular tools provide it.

This means that a REST API designer will have to do by hand most of the work that's automated
when using HTTP RPC. It is of course a chicken-and-egg problem: when tooling isn't available,
very few people will do REST, and if very few people do REST, tooling developers don't see the
need to include REST in their projects (which is perfectly reasonable from their point of view,
as engineering effort is not cheap).

Worse, once a developer has chosen REST, their API may face difficulties in acceptance and
deployment, because potential users may balk at implementing a REST client when they are used to
simply auto-generating client code from an OAS document for their language of choice.

There are a few protocols that natively support REST today, but some are based on dated
technologies ([WSDL](https://www.w3.org/TR/wsdl20-primer/) and [OData](https://www.odata.org/) both support REST, but use XML as the data format) and others are
based on more modern technologies like JSON but don't support generating code and documentation
yet.

While no tooling exists today to statically generate documentation from REST API specifications,
some APIs include metadata about the API in their hypermedia links (and it's a native feature in
Hydra).


### Research

The other issue is that if REST is rarely used, it is also rarely investigated by researchers
and there are few retrospectives on its use. As a result, there isn't the same body of known
good practices and people implementing REST APIs will have to discover by themselves what the
traps of REST are.


## Conclusion

As I said before, I think REST is awesome. It is incredibly powerful (literally, as people don't
believe how powerful it is) and it deserves to be pushed into wider use. It has the potential to
bring benefits to everyone in the API chain: API designers, API developers and operators, client
developers and end-users. We all deserve REST being more present in our tech!

The only solution to the chicken and egg problem is to continue using and deploying REST APIs,
and continue documenting our journey doing so. Whenever we can, we should start creating tools to
help REST developers. Each small step will help create momentum.


# Footnotes

<sup><a id="fn.1" href="#fnr.1">1</a></sup> [Building Hypermedia APIs with
HTML](https://www.infoq.com/presentations/web-api-html/), by Jon Moore at QCon 2013