@rivol
Created July 27, 2018 08:10
EP18 talk
<!DOCTYPE html>
<html>
<head>
<title>Creating Solid APIs</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<link rel="stylesheet" href="css/font-awesome.min.css">
<style type="text/css">
@import url(https://fonts.googleapis.com/css?family=Ubuntu+Mono:400,700,400italic|Ubuntu:400,500,700);
@import url(https://fonts.googleapis.com/css?family=Open+Sans:400,600,700,300,400italic,300italic|Lato:400,700,300,100,900);
/*
#273040
#ef6036
*/
body {
font-family: 'Open Sans';
}
h1, h2, h3 {
font-family: 'Lato';
font-weight: 700;
}
em { font-weight: 300; }
a, a:visited {
color: #bbbbdd;
}
.muted { color: #888; }
.float.left { float: left; }
.float.right { float: right; }
.remark-notes-area .remark-notes {
font-size: 24px;
}
/* General slide styles */
.remark-container { background: #000; }
.remark-slide-scaler { box-shadow: none; }
.remark-slide-content {
font-size: 52px;
color: #ccc;
padding: 0.5em 1.5em;
background: #000;
background-size: cover;
}
.remark-slide-content {background: #000;}
.remark-slide-content .small { font-size: 85%; }
.remark-slide-content h1 {
margin: 0 0 0.5em;
color: #fff;
display: inline-block;
border-bottom: 3px solid #ef6036;
}
.remark-slide-content h1 { font-size: 70px; }
.remark-slide-content h2 { font-size: 40px; }
.remark-slide-content h3 { font-size: 28px; }
.remark-slide-content ul { margin: 0.5em 0; }
.remark-slide-content ul > li, .remark-slide-content ol > li { margin-bottom: 18px; }
.remark-slide-content li > ul { margin-top: 18px; }
.remark-slide-content p { margin: 1em 0 0.5em; }
.remark-slide-content h1 + p { margin-top: 0; }
.remark-slide-content h1 + ul { margin-top: 0; }
.remark-slide-content > p:first-child { margin-top: 0; }
.remark-slide-content li p { margin-top: 0.5em; }
.remark-slide-content ul + p { margin-top: 1em; }
.remark-slide-content pre { margin: 0.3em 0; }
/* Slide variations */
/* Subsection heading */
.remark-slide-content.middle h1 {
font-size: 72px;
border-bottom: none;
line-height: 1.3;
}
.remark-slide-content.small {
font-size: 34px;
}
/* Large slide - usually for huge section titles */
.remark-slide-content.large {
font-size: 48px;
background: #000;
color: #f27d5a;
}
.remark-slide-content.large h1 {
font-size: 96px;
color: #f27d5a;
}
/* The very first title/cover slide */
.remark-slide-content.cover {
text-shadow: none;
position: relative;
padding: 1em 2em;
}
.remark-slide-content.cover h1 {
margin: 1.5em 0 2em;
font-size: 80px;
line-height: 1.3;
border-bottom: none;
}
.remark-slide-content.cover h1 small {
font-size: 70%;
}
.remark-slide-content.cover p {
margin: 0 0 0.5em;
color: #bbb;
font-weight: 300;
}
.remark-slide-content.cover .credentials {
position: absolute;
bottom: 1em;
left: 0;
right: 0;
}
/* Optional subtitle */
.remark-slide-content.cover .subtitle {
display: block;
height: 25px;
font-size: 32px;
font-weight: 400;
margin-bottom: 4em;
}
.remark-slide-content.creds {
color: #f27d5a;
}
.remark-slide-content.creds, .remark-slide-content.creds a {
color: #f27d5a;
}
.remark-slide-content.creds a {
text-decoration: none;
border-bottom: 1px solid rgba(242, 125, 90, 0.75);
}
.remark-slide-content.creds h1 {
text-align: center;
}
.remark-slide-content.creds .contacts {
margin: 2em 0;
}
.remark-slide-content.creds .project-link {
font-size: 32px;
}
/* Image slide - use with background: property */
.remark-slide-content.image {
background-size: contain;
background-position: center;
background-repeat: no-repeat;
}
/* Element styles */
.remark-code { font-size: 32px; }
.remark-code, .remark-inline-code {
font-family: 'Ubuntu Mono';
border-radius: 0.2em;
margin: 0 0.1em;
}
.remark-inline-code {
background-color: #282a36;
padding: 0 0.3em;
}
.remark-code, .hljs-default .hljs {
display: block;
padding: 0.5em 1.0em;
background-color: rgba(255, 245, 230, 0.25);
border-left: 16px solid #ef6036;
}
.hljs-dracula .hljs {
padding-left: 1em;
background: #17181F;
}
.code-small .remark-code {
font-size: 26px;
}
.code-small h1 + pre > .remark-code {
margin-top: -2em;
}
.code-tiny .remark-code {
font-size: 20px;
}
.code-tiny h1 + pre > .remark-code {
margin-top: -4em;
}
.remark-slide-number {
font-size: 32px;
display: none;
}
.remark-notes-area {
background: #222;
color: #ddd;
}
.remark-notes {
font-size: 28px;
line-height: 1.5;
}
.remark-notes-preview {
font-size: 20px;
}
.remark-notes-area .remark-bottom-area .remark-notes-current-area {
height: 80%;
}
.remark-slide-content img {
width: 100%;
}
strong {
font-weight: 600;
}
@page {
size: 1210px 681px;
margin: 0;
}
@media print {
.remark-slide-scaler {
width: 100% !important;
height: 100% !important;
transform: scale(1) !important;
top: 0 !important;
left: 0 !important;
}
}
</style>
</head>
<body>
<textarea id="source">
class: center, cover
# Creating Solid APIs
EuroPython 2018
Rivo Laks *⋅* 2018-07-27
???
Thank you, and Good morning!
I'm Rivo and this is my talk on how to create Solid APIs.
Our applications are increasingly being used not by humans but by other applications - via their APIs. "APIs are eating the world" they say.
But ironically, APIs themselves are first used by humans - other developers integrating with your app.
That means that your APIs must target not just machines, but even more importantly humans as well.
(They must be easy to get started with, intuitive to use, and frictionless.)
---
class: center, middle, large
# Background
???
I come from Estonia and until about a month ago I was part of Thorgate - an Estonian product development agency, working on a variety of projects built with Python, Django and related technologies.
This talk was inspired by one of the projects there where the API had top priority from day one. The focus was on creating a platform for managing forestry-related data. Other developers had to be able to interface with it, fetch data and do various operations. The UI also had to be built on top of the publicly available API.
Like most developers, I had used and even created APIs before, but this project had higher demands and it got me thinking - how should I approach this?
What are the guidelines, the best practices, the tips and tricks for making APIs that other developers would love to use?
This talk aims to share what I found.
---
# What is an API?
???
API is usually defined as application programming interface.
A better definition however might be "application programmer interface".
API is really a user interface for other developers.
--
- _application programming interface_
--
- _application **programmer** interface_
--
- API is **user interface** for developers
---
# What Makes an API **Good**?
???
Let's think about how to make that user interface good.
quick overview of points, details in next slides
- documentation must exist and be helpful
- standards help bring familiarity and get started faster
- make sure the user has to deal with as few issues (friction) as possible
mention that web APIs are target but much is applicable to packages/libraries/codebases in general.
--
- Documentation
--
- Familiarity
--
- Lack of friction
---
# Documentation
- Often overlooked
???
Documentation is too often overlooked. We don't want to do this, it's not the fun part. At least when you're developing.
But when you're trying to make sense of something created by others, then documentation becomes MUCH more valuable.
Docs are often also the first point of contact people have with your API. Thus they might decide whether someone sticks with your offering or keeps looking. I do the same myself when looking for packages that solve some common problem. There is usually a variety of options available, and the first impression of the readme and docs is quite an important deciding factor.
Now, docs do take effort. And I admit I'm not very good at doing them myself. But that doesn't mean we shouldn't try.
If you took only a single thing from my talk, let it be this -- put some effort into good docs.
--
- Gives the first impression
--
- The effort is worth it!
---
class: center, middle, large
# Creating <br> Awesome Docs
???
Sales page analogy - docs need to sell your API, convince potential users that this is the choice to go for.
Good news is, your audience is other developers, which makes things easier.
---
# What Should Go In There?
???
README should have purpose or 1-paragraph sales pitch.
If a dev account is required, offer a demo account / API key, so that users can immediately try out a few queries without having to sign up.
Root URL - mention HTTPS
--
- How do I access it?
--
- Do I need developer account?
--
- Root URL, etc
--
- Authentication info
---
# What Should Go In There?
- General encodings, formats, etc
???
Often the mundane gets overlooked. But important for those with less experience with APIs.
Listing common errors (& solutions) makes users more efficient.
--
- Pagination, versioning, etc
--
- Common errors
--
- **Code for getting started**
---
# The Endpoints
???
[6:20]
--
- URL & operations
--
- Request/response data
--
- Optional parameters
--
- Permissions etc
---
# Keep it Fresh!
???
--
- Obsolete docs are the worst
--
- **Always autogenerate!**
--
- Usually code &raquo; schema &raquo; docs
---
# Schema & Autogeneration
--
- OpenAPI, Swagger, etc
--
- Use your tools
--
- Combine docs & code examples
--
- Client libs autogeneration
---
class: center, middle, large
# Standardize!
???
[11:00] Let's talk about standardization and why it matters.
Following standards is good because it gives your users a sense of familiarity.
If you create something that's hand-crafted and completely unique, then your users will have to learn everything about it.
If instead you follow a widespread standard, then it's likely that they already have experience with something that uses the same or a similar standard, and can transfer this knowledge to your API.
Just as importantly, standards usually have some thought put into them, and that helps you avoid common pitfalls. This is similar to how frameworks make decisions for you, such as how to store passwords, and end up helping you avoid common, but often major, issues.
(They also help avoid bikeshedding - pointless arguments about what's the best way to do something.)
---
# JSON API
http://jsonapi.org/
_one potential standard to use_
???
When it comes to APIs, my current standard of choice is JSON API.
JSON API is not just an API that uses JSON for its responses, but an actual specification for building APIs. It was created by authors of Ember and offers a comprehensive solution to building efficient APIs.
I should stress that it is one option of several, and there are others that accomplish the same goals, such as GraphQL.
You should choose based on your specific project and its needs.
--
GraphQL _is another option_
---
# Standardize Structure
???
One of the most important aspects of JSON API is that it defines a generic yet flexible structure for API responses.
Let's look at how JSON API responses are structured
As the example project, I'm using a project management tool that lets users define projects, epics and stories, and comment on them, kind of similar to Basecamp.
--
Responses have predictable, familiar structure
---
class: code-small
```http
GET https://example.com/api/v1/projects
```
```json
{
"links": {
"next": "https://example.com/api/v1/projects?cursor=cD0yMDE4L",
"prev": null
},
"data": [...],
"included": [...]
}
```
???
Make request to the projects' listing.
The response document has three top-level members - links, data, and included. Let's look at them one-by-one.
The links are important because they enable discovery of related endpoints. In the same way, the API root should offer links to resource pages.
!! do NOT go into different pagination styles !!
---
class: code-small
```json
"data": [
{
"type": "project",
"id": "289",
"links": {
"self": "https://example.com/api/v1/projects/289"
},
"attributes": {
"created": "2018-06-28T22:52:08.690486Z",
"name": "Allison-Patterson",
"description": "aggregate collaborative models"
},
"relationships": {...}
},
...
],
```
???
Next up is `data` which contains the so-called 'primary data' - the resource or resources you asked for.
In this case we asked for a list of projects, so the data is a list too - list of resource objects to be exact.
Each resource has type and id, which uniquely identify it.
Resources can also include links. In our case, every resource will contain its own canonical link. You can use that to operate on this object specifically, for example to modify it, or just retrieve the full details of a single object.
Next up is attributes, which is quite self-explanatory. In this case, we get project's creation timestamp, which btw uses standard ISO format, as well as project's name and description.
Resources can also have related objects, returned under `relationships` object.
---
class: code-small
```json
"data": [{ ...
"relationships": {
"created_by": {
"data": {
"type": "user",
"id": "199"
}
},
"epics": {
"data": [
{
"type": "epic",
"id": "3101"
}
]
}
} }, ...
],
```
???
Each related object has type and id. These uniquely identify the object and allow us to look it up in included resources.
---
class: code-small
```json
"included": [
{
"type": "epic",
"id": "3101",
"attributes": {
"created": "2018-06-28T22:50:45.885691Z",
"name": "Ergonomic background extranet"
},
"links": {
"self": "https://example.com/api/v1/epics/3101"
}
},
{
"type": "user",
"id": "199",
"attributes": {...}
}
]
```
???
Included resources are similar to the primary data.
They can be used to optimize the number of requests your app does. Example.
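A minimal sketch of how a client might resolve relationships against `included` without extra requests, assuming a response shaped like the examples above. The helper names `index_included` and `resolve` and the sample email address are made up for illustration; they are not part of any library.

```python
def index_included(document):
    """Map (type, id) -> resource object for everything under "included"."""
    return {(res["type"], res["id"]): res for res in document.get("included", [])}


def resolve(resource, name, index):
    """Return the included resource(s) behind one relationship of `resource`."""
    linkage = resource["relationships"][name]["data"]
    if linkage is None:  # empty to-one relationship
        return None
    if isinstance(linkage, list):  # to-many relationship
        return [index.get((ref["type"], ref["id"])) for ref in linkage]
    return index.get((linkage["type"], linkage["id"]))  # to-one relationship


# Trimmed-down version of the projects response; the email is hypothetical.
document = {
    "data": [{
        "type": "project",
        "id": "289",
        "relationships": {
            "created_by": {"data": {"type": "user", "id": "199"}},
            "epics": {"data": [{"type": "epic", "id": "3101"}]},
        },
    }],
    "included": [
        {"type": "epic", "id": "3101",
         "attributes": {"name": "Ergonomic background extranet"}},
        {"type": "user", "id": "199",
         "attributes": {"email": "user@example.com"}},
    ],
}

index = index_included(document)
project = document["data"][0]
creator = resolve(project, "created_by", index)
epic_names = [epic["attributes"]["name"] for epic in resolve(project, "epics", index)]
```

The same two helpers would work for any other resource type in the API, which is the point of having one response structure.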
---
class: center, middle, large
# Impressions?
???
[17:00] How did that make you feel?
If you haven't used JSON API before, didn't know anything about it at all, then it might have looked weird.
It might have looked bloated - if you wanted to receive the name of user who created a project, you'd have to jump through several layers of indirection.
And yet, if I now gave you response for another object from that same API, told you that it has an 'updated_by' field which is a user and wanted to retrieve the email of that user, you would know exactly how to do that. Because the data would be structured in exactly the same way.
Furthermore, if I gave you a different API, completely unrelated, and told you that it also uses JSON API, you would know how to access that one as well.
And that is the power of standardization. It brings familiarity, makes concepts that you already know applicable to something new.
TODO: maybe break this into two:
- initial Impressions? question
- and Power of Standardization explainer or smth
---
# Flexibility
Configurable fields:
```http
GET https://example.com/api/v1/projects
GET https://example.com/api/v1/projects \
?include=comments
GET https://example.com/api/v1/projects \
?include=comments&fields[project]=name,comments
```
---
class: code-small
# Pagination
List responses have `next` / `prev` links
```json
{
"links": {
"next": "https://example.com/api/v1/projects?cursor=cD0yMDE4L",
"prev": null
},
"data": [...]
}
```
--
Cursor-based pagination FTW (but YMMV).
???
Sometimes page-number-based makes more sense
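A rough sketch of what following the `next` links could look like on the client side. The `fetch` callable stands in for whatever HTTP client you use, and the in-memory `PAGES` dict with its cursor value is made up so the example runs without a network.

```python
def iter_resources(fetch, url):
    """Yield every resource object, following `links.next` until it is null."""
    while url:
        document = fetch(url)
        yield from document["data"]
        url = document.get("links", {}).get("next")


# Fake two-page listing; a real client would do an HTTP GET and parse the JSON.
FIRST_PAGE = "https://example.com/api/v1/projects"
SECOND_PAGE = "https://example.com/api/v1/projects?cursor=abc"
PAGES = {
    FIRST_PAGE: {
        "links": {"next": SECOND_PAGE, "prev": None},
        "data": [{"type": "project", "id": "1"}, {"type": "project", "id": "2"}],
    },
    SECOND_PAGE: {
        "links": {"next": None, "prev": FIRST_PAGE},
        "data": [{"type": "project", "id": "3"}],
    },
}

project_ids = [project["id"] for project in iter_resources(PAGES.__getitem__, FIRST_PAGE)]
```

The client never has to build or understand the cursor itself; it only follows the links the server hands out.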
---
# There is More ...
- Filtering
- Ordering
???
`filter=...` impl. dependent
`?sort=age,name`
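A small sketch of assembling such query strings in a client. It assumes the parameter names from the JSON API specification (`include`, `fields[type]`, `sort`, with a leading `-` for descending order); `build_url` itself is a hypothetical helper, and filtering is left out because it is implementation dependent.

```python
from urllib.parse import urlencode


def build_url(base, include=(), fields=None, sort=()):
    """Assemble JSON API style query parameters into a URL."""
    params = {}
    if include:
        params["include"] = ",".join(include)
    for resource_type, names in (fields or {}).items():
        params[f"fields[{resource_type}]"] = ",".join(names)
    if sort:
        params["sort"] = ",".join(sort)
    # Keep brackets and commas readable instead of percent-encoding them.
    query = urlencode(params, safe="[],")
    return f"{base}?{query}" if query else base


url = build_url(
    "https://example.com/api/v1/projects",
    include=["comments"],
    fields={"project": ["name", "comments"]},
    sort=["-created", "name"],
)
```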
---
# Errors
???
Errors happen.
Make it easy for the users to figure out why something happened and how they can fix the problem.
Once again, the goal is to reduce friction, making errors faster and easier to solve.
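A minimal sketch of what a structured error format buys the client, assuming error objects with `detail` and `source.pointer` members as in the JSON API spec. The helper `errors_by_field` is made up for illustration.

```python
def errors_by_field(document):
    """Group error messages by the attribute named in `source.pointer`."""
    grouped = {}
    for error in document.get("errors", []):
        pointer = error.get("source", {}).get("pointer", "")
        # "/data/attributes/name" -> "name"; errors without a pointer go under "".
        field = pointer.rsplit("/", 1)[-1]
        message = error.get("detail") or error.get("title")
        grouped.setdefault(field, []).append(message)
    return grouped


response = {
    "errors": [
        {
            "title": "Invalid Attribute",
            "detail": "Name must contain at least three letters.",
            "source": {"pointer": "/data/attributes/name"},
            "status": "422",
        }
    ]
}

field_errors = errors_by_field(response)
```

A form UI could then show each message next to the matching input field, with no string parsing of free-form error text.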
--
```http
POST https://example.com/api/v1/projects
```
```json
{
"errors": [
{
"title": "Invalid Attribute",
"detail": "Name must contain at least three letters.",
"source": { "pointer": "/data/attributes/name" },
"status": "422"
}
]
}
```
---
# Special Cases
For when you have LOTS of data
???
[22:00] Make a section heading?
There are special cases, e.g. when a large amount of data needs to be transmitted.
In those cases a different, more specialized format might make more sense.
(note though that JSON responses are usually compressed in transit, so the visible bloat is actually less of a problem)
Even better solution might be including link to downloadable data in your main API response. JSON API already has `links` object that could contain urls to various secondary data download points.
"out-of-band"
--
- _out-of-band_ approach
---
class: code-small
```http
GET https://example.com/api/v1/datasets/123
```
```json
{
"data": {
"type": "dataset",
"id": "123",
"attributes": {
"name": "CIFAR10 dataset"
},
"links": {
"data_tgz": "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz",
"self": "https://example.com/api/v1/datasets/123"
}
}
}
```
???
Even better solution might be including link to downloadable data in your main API response. JSON API already has `links` object that could contain urls to various secondary data download points.
"out-of-band"
Additional benefit of being able to store your data outside your app server, eg in S3, and just link to it.
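A sketch of the client side of the out-of-band approach, assuming the download URLs sit next to `self` in the resource's `links` object as in the example above. Both helper names are made up, and the actual download call is left commented out so nothing is fetched.

```python
import shutil
import urllib.request


def download_links(document):
    """Return every link on the primary resource except its own `self` URL."""
    links = document["data"].get("links", {})
    return {name: url for name, url in links.items() if name != "self"}


def download(url, path):
    """Stream a large file straight to disk instead of loading it into memory."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as target:
        shutil.copyfileobj(response, target)


document = {
    "data": {
        "type": "dataset",
        "id": "123",
        "attributes": {"name": "CIFAR10 dataset"},
        "links": {
            "data_tgz": "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz",
            "self": "https://example.com/api/v1/datasets/123",
        },
    }
}

links = download_links(document)
# download(links["data_tgz"], "cifar-10-python.tar.gz") would fetch the archive.
```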
---
# Standardization Matters
- the specific standard isn't that important
- GraphQL, etc are also good options
---
class: center, middle, large
# Authentication &<br>Authorization
???
Most APIs don't deal only with public data, or even if they do, they want to be able to identify clients for various purposes, such as request limits.
So you need to think about authentication - how to identify who's making the requests. Plus authorization - what is the user allowed to access.
The best practice here largely depends on the use case. Covering two major options.
- token authentication
- or, OAuth2
Django OAuth Toolkit for DRF
---
# Token Authentication
Clients send an HTTP header, e.g.
```http
Authorization: Token 9944b09199c62bcf9418
```
???
The simple approach is token authentication, where clients send an HTTP header containing a simple token with each request.
Token auth is useful for client-server situations where the client is, for example, a native mobile application that the user logs into directly. In that case the mobile app could receive the token after the user has successfully logged in.
Session cookies are basically one kind of token. If your API is only accessed from browsers, you might not need anything else.
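A minimal sketch of the client side, using only the standard library. The `Token` prefix matches the header shown on the slide; `authorized_request` is a hypothetical helper and the request is only built here, not sent.

```python
import urllib.request


def authorized_request(url, token):
    """Build a request that carries the token in the Authorization header."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Token {token}")
    return request


request = authorized_request(
    "https://example.com/api/v1/projects", "9944b09199c62bcf9418"
)
# urllib.request.urlopen(request) would then send it over HTTPS.
```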
---
# OAuth 2.0
???
For more complicated situations, there's OAuth2.
OAuth is meant for creating platforms. Think Facebook, where a 3rd party app can request access to a user's data, the platform then verifies this request, asks for the user's permission, and grants the app a token which is both app-specific and user-specific.
OAuth2 is quite a complex protocol, covering different use cases and different flows, such as mobile apps, web apps, plus apps with very limited UI such as living room devices.
This is good because once again, you'll be using proven standards that have evolved over the years and solved many problems you might not think about. But the downside is that it requires quite a lot of attention when implementing.
Luckily there are various libraries available that take care of most of the plumbing. If you're using Django, there's Django OAuth Toolkit, if not, there's the OAuthLib that the Django OAuth Toolkit itself is built upon.
For our API project, we had to add some functionality on top of Django OAuth Toolkit. Some of this was due to missing features of DOT, for example better redirect URI validation. Most however was due to the requirements of the project itself, such as the need to pass around more info related to the tokens. We also reimplemented the pages where developers can register applications, to make them more user friendly.
--
- For creating platforms
--
- Complex, but solves many issues
--
- Many packages, e.g. _Django OAuth Toolkit_, _OAuthLib_
---
class: center, middle, large
# Versioning
???
[28:30] Let's talk about versioning.
Most importantly, you should think about versioning from day one.
It's difficult to bolt it on later, because people will be using your API with the expectation that it will never change, since you didn't give them any indication otherwise.
If you have versioning from the beginning, it will be easier to manage these expectations, but you should also make it clear how long old versions will be supported and how developers can find out about these support schedules.
---
# Versioning Schemes
???
How to specify versions in API requests? There are different options here, I'll cover the two most popular ones.
You can have clients specify the version as part of the `Accept` HTTP header. [example]. The header-based approach is more idealistic, since the version used is sort of meta information, which is then kept out of the URL paths.
But headers are a bit harder to use and test, e.g. you cannot easily test different versions with a browser. So, in the real world, path-based versioning might be a more pragmatic choice. It can make debugging easier as well: if server logs contain the path, you get the version too.
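A hand-rolled sketch of how a server could read the version under each scheme. This is only an illustration of the idea, not how DRF's versioning classes are implemented, and both function names are made up.

```python
import re


def version_from_accept(accept):
    """Header-based: 'application/json; version=1.0' -> '1.0'."""
    for param in accept.split(";")[1:]:
        key, _, value = param.strip().partition("=")
        if key == "version":
            return value
    return None


def version_from_path(path):
    """Path-based: '/v1/projects' -> '1'."""
    match = re.match(r"^/v(\d+)/", path)
    return match.group(1) if match else None


header_version = version_from_accept("application/json; version=1.0")
path_version = version_from_path("/v1/projects")
```

With the path-based scheme the version also shows up in access logs for free, which is the debugging benefit mentioned above.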
--
- `AcceptHeaderVersioning` _(DRF)_
```http
GET /projects HTTP/1.0
Accept: application/json; version=1.0
```
--
- `URLPathVersioning` _(DRF)_
```http
GET /v1/projects HTTP/1.0
Accept: application/json
```
---
# Versioning Schemes Cont.
- Integers (`v1`) vs dates (`2018-07-27`)?
???
Integers vs dates, similar to CalVer.
Remove as much friction from upgrading as possible.
This means changelogs, easy possibility to use different versions in the same client, etc.
--
- Dates are less emotional
--
- Make upgrades easy
---
# Version Transformers
???
Two broad categories - incremental, or breaking.
For incremental changes, a nice approach is version transformers.
Similar to Django's middlewares.
Your core code would then support only the newest version, and older versions are handled with transformers. Each transformer knows how to translate incoming requests from older version to a newer version, and how to translate responses from newer version back to older ones. This makes it quite easy to e.g. change field names or add new fields without affecting old versions.
Notably, Stripe is using that approach in their API.
For massive breaking changes, transformers might not work out.
In that case, you may need to just create a completely new API implementation, duplicating some code in the process.
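A toy sketch of the transformer idea, assuming integer versions and a change history that only renames fields. The `RenameField` class, the field names and `core_view` are all invented for illustration; this is not Stripe's implementation.

```python
class RenameField:
    """Hypothetical transformer: the newer version renamed `old` to `new`."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def upgrade_request(self, data):
        data = dict(data)
        if self.old in data:
            data[self.new] = data.pop(self.old)
        return data

    def downgrade_response(self, data):
        data = dict(data)
        if self.new in data:
            data[self.old] = data.pop(self.new)
        return data


# Oldest change first: item 0 translates v1 <-> v2, item 1 translates v2 <-> v3.
TRANSFORMERS = [RenameField("title", "name"), RenameField("desc", "description")]


def core_view(data):
    """Core code only knows the latest version (v3) of the field names."""
    return {"id": "1", **data}


def handle(request_data, version, core=core_view):
    """Upgrade the request to the latest version, run core, downgrade the response."""
    steps = TRANSFORMERS[version - 1:]
    for step in steps:
        request_data = step.upgrade_request(request_data)
    response = core(request_data)
    for step in reversed(steps):
        response = step.downgrade_response(response)
    return response


v1_response = handle({"title": "Roadmap", "desc": "Plans"}, version=1)
v3_response = handle({"name": "Roadmap", "description": "Plans"}, version=3)
```

Each new version adds one transformer to the list, while `core_view` keeps dealing with a single shape of data.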
--
- **&raquo;** **&raquo;** Requests into **newer** version **&raquo;** **&raquo;**
- Core code is for **latest** version
- **&laquo;** **&laquo;** Responses into **older** version **&laquo;** **&laquo;**
--
- Won't work for big, breaking changes
---
class: center, middle, large
# Client's Perspective
???
Now, let's look at the same thing from the other perspective - that of a client trying to *use* an API.
---
# The Scenario
- Let's try speech recognition...
???
Let's say I have some audio and I want to do speech recognition on it.
Using GCP and AWS as examples.
Both good examples because both provide a SINGLE library providing access to ALL services, in a unified and familiar way.
--
- ... using AWS and GCP
---
# Getting Started
- Documentation
--
- Quite easy to find
- A bit overwhelming
--
- Code examples
---
# Comprehensive Clients
- Install Python client
- GCP: `google-cloud` package
- AWS: `boto3` package
--
- Authenticate
--
- Thorough docs
---
- Amazon
```python
import boto3
client = boto3.client('transcribe')
response = client.start_transcription_job(...)
```
--
- Google
```python
from google.cloud import speech
client = speech.SpeechClient()
results = client.recognize(...)
```
???
Takeaways:
- both SDKs provide a common interface to all included services - ala s3, ec2, transcribe
- both SDKs are also somewhat similar to each other, using common and familiar patterns
- docs are thorough, getting started, code examples
- both are at least partially autogenerated
- this can benefit you as well!
---
# In Summary
- Invest in documentation
- Embrace standards _(e.g. JSON API)_
- Use automation
- Reduce friction
???
docs matter, first impression, etc. standards bring familiarity. automation for docs and client libs
in general, reduce friction as much as possible. friction is all the little bumps and issues that drive people away. think of the humans!
---
class: creds
# Thanks!
.contacts.center[
Rivo Laks *⋅* @rivolaks *⋅* [rivolaks.com](https://rivolaks.com)
[tinyurl.com/ep18api](https://tinyurl.com/ep18api)
]
???
What I talked about is MOST important for big projects, where the API is an important platform, used by many external developers over the years.
But all of it is also applicable to smaller products, and leads to maintainable code in general.
- [How to make a good library API — Flávio Juvenal](https://youtu.be/4mkFfce46zE)
- [OAuth 2.0 Servers](https://www.oauth.com/)
- https://stripe.com/blog/api-versioning
</textarea>
<script src="https://gnab.github.io/remark/downloads/remark-latest.min.js">
</script>
<script type="text/javascript">
var slideshow = remark.create({
ratio: '16:9',
navigation: {
scroll: false,
touch: true,
click: true
},
highlightStyle: 'dracula',
});
</script>
</body>
</html>