brodygov/api-authentication.md

Thoughts on API Authentication Strategies

There are a number of different strategies for system-to-system API authentication between two parties. Each has its own advantages and disadvantages.

Simple API key

The simplest approach is typically to pass a secret API key in a request header or via HTTP basic auth. The client provides the secret value in the Authorization header (for example with the Basic or Bearer scheme) or in a custom header. The server matches the key against a stored value for that account. This relies on HTTPS / TLS to provide confidentiality and integrity. This approach excels for websites with many end-users who need to manage their own keys through a web interface or API. It's so simple that clients don't need any custom code.
Pros:


Very simple to implement for both clients and servers

Cons:


If TLS is broken, security is completely lost (leads to loss of confidentiality, integrity, replay resistance)
Server must store secret key in plaintext (can be avoided with a variant that uses two keys: a non-secret key ID used for lookup and a secret key that the server stores only as a hash)

Example APIs:

Stripe API: https://stripe.com/docs/api#authentication
Twilio API: https://www.twilio.com/docs/api/rest/keys

Made-up example:

# Request will be sent with header Authorization: Basic YWNjdF9pZDEyMzQ6RWkzVUlVT09kbFRRUDhLMQ==
curl -i -u acct_id1234:Ei3UIUOOdlTQP8K1 https://api.example.com/v1/list-things
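
For clients that build the request by hand, the header that curl sends can be reproduced in a few lines. This is a minimal Python sketch; the function name `basic_auth_header` is hypothetical, and the credentials are the made-up ones from the curl example.

```python
import base64


def basic_auth_header(account_id: str, secret: str) -> str:
    """Return the value of the Authorization header for HTTP basic auth."""
    # Basic auth is just base64("user:password"); it is an encoding, not encryption,
    # which is why this scheme depends entirely on TLS.
    raw = f"{account_id}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


print(basic_auth_header("acct_id1234", "Ei3UIUOOdlTQP8K1"))
# Basic YWNjdF9pZDEyMzQ6RWkzVUlVT09kbFRRUDhLMQ==
```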

OAuth flow

OAuth is widely used by consumer websites where an end-user is involved and wants to authorize a third-party application to access their account on the main website. This is especially helpful where you want to grant different permission levels for particular actions ("scopes" in OAuth parlance).
OAuth tends to be overly complicated for server-to-server authentication where no end-user is involved.
Pros:


Great for 3-legged flows (end-user authorizes app to access their account)
Widely used, lots of library support in all major languages
There is often built-in support for rotating credentials (request a new access token using a refresh token)
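
As a rough illustration of the refresh step, the sketch below builds (but does not send) a token refresh request in the shape defined by the OAuth 2.0 specification: a form-encoded POST with `grant_type=refresh_token`. The token URL, client credentials, and the function name `build_refresh_request` are placeholders, and individual providers may expect the client credentials in a different place.

```python
import base64
import urllib.parse
import urllib.request


def build_refresh_request(token_url: str, client_id: str,
                          client_secret: str, refresh_token: str) -> urllib.request.Request:
    """Build a POST request that exchanges a refresh token for a new access token."""
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    # The client authenticates itself with HTTP basic auth (an assumption here;
    # some providers accept client_id/client_secret in the body instead).
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical endpoint and credentials, for illustration only.
req = build_refresh_request("https://auth.example.com/token",
                            "client-id", "client-secret", "refresh-token-123")
```

Passing `req` to `urllib.request.urlopen` would send it; the JSON response is expected to carry the new access token.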

Cons:


Overly complicated for server-to-server auth
Complex enough that it's hard to wrap your head around how it works with all the edge cases
If TLS is broken, security is completely lost (leads to loss of confidentiality, integrity, replay resistance)

Example APIs:

Twitter API: https://developer.twitter.com/en/docs/basics/authentication/overview/oauth
Google+ API: https://developers.google.com/+/web/api/rest/oauth
Facebook API: https://developers.facebook.com/docs/facebook-login/access-tokens/

TLS client authentication

TLS client authentication authenticates both client and server using TLS certificates. Because of the complexity of issuing certificates and managing certificate authority lists, this tends to be used only for internal service-to-service authentication within an organization. U.S. federal executive branch websites also sometimes use PIV cards for user-to-website TLS client authentication.
If keys are compromised, this approach usually requires both parties to agree to change the certificates and have a flag day to switch over. As a result, I would only recommend using TLS client authentication in cases where both client and server are a part of the same organization, which makes it easier to do coordinated deploys.
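
On the server side, requiring a client certificate is mostly a matter of TLS configuration. The sketch below uses Python's standard `ssl` module; the function names and all file paths are hypothetical, and the optional CRL step assumes a PEM-encoded revocation list supplied by your CA.

```python
import ssl
from typing import Optional


def new_mtls_server_context() -> ssl.SSLContext:
    """Create a server-side context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to one of the loaded CAs.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def load_credentials(ctx: ssl.SSLContext, server_cert: str, server_key: str,
                     client_ca: str, crl: Optional[str] = None) -> None:
    """Load the server's own certificate, the CA that signs clients, and optionally a CRL."""
    ctx.load_cert_chain(certfile=server_cert, keyfile=server_key)
    ctx.load_verify_locations(cafile=client_ca)
    if crl is not None:
        # CRLs are loaded through the same call; the flag turns on checking.
        ctx.load_verify_locations(cafile=crl)
        ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF


ctx = new_mtls_server_context()
# load_credentials(ctx, "server.pem", "server.key", "clients-ca.pem")  # hypothetical paths
```

Rotating a compromised key means reissuing certificates and updating the files loaded here on every server, which is where the coordination cost described above comes from.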
Pros:


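Very high security: both parties are authenticated during the TLS handshake, and private keys are never sent over the wire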

Cons:


Difficult to rotate keys
Complicated to implement on the server side
Publicly trusted certificates cost money
Private CA certificate issuance is a complicated process that is easy to get wrong
Must manage certificate revocation lists (CRLs) and deploy them to all authentication points

Example APIs:

I couldn't find any major public APIs that do this in general. There are a couple that use it in very specific contexts:
https://blog.cloudflare.com/introducing-tls-client-auth/
https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-client-side-ssl-authentication.html
https://developer.okta.com/blog/2015/12/02/tls-client-authentication-for-services
Federal government sites (user-to-service):
https://login.max.gov
https://login.gov

HMAC signature

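There are some more sophisticated schemes that use a secret API key to sign message content rather than sending the raw secret key in the message. The server recomputes the signature with its own copy of the key and rejects the request if the two do not match.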
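
A minimal sketch of such a scheme, using HMAC-SHA256 from the Python standard library, might look like the following. The canonical message layout (method, path, timestamp, and body hash joined by newlines), the five-minute clock-skew window, and the function names are all assumptions for illustration; real schemes such as AWS Signature Version 4 define canonicalization far more carefully.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed tolerance; limits how long a captured request can be replayed


def canonical_message(method: str, path: str, timestamp: int, body: bytes) -> bytes:
    """Serialize the parts of the request that the signature should cover."""
    body_hash = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, str(timestamp), body_hash]).encode("utf-8")


def sign(secret: bytes, method: str, path: str, timestamp: int, body: bytes) -> str:
    """Client side: compute the signature to send alongside the request."""
    message = canonical_message(method, path, timestamp, body)
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, path: str, timestamp: int, body: bytes,
           signature: str, now: Optional[float] = None) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old or too far in the future
    expected = sign(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The secret never appears in the request; an eavesdropper who sees the signature can at most replay that exact request within the skew window.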
Pros:


Even if TLS is broken, only the confidentiality of individual messages is lost. The secret key is not disclosed, and integrity is still guaranteed.
Message integrity can be verified by an end service even if an intermediate load balancer terminates TLS.
In a large environment with many services, you can use a key service and derive per-service signing keys to limit key exposure. (e.g. AWS derives a key per-service, per-data-center, per-day from the original secret key. Each service only needs the derived key to verify messages.)
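
The per-service key derivation can be sketched with chained HMACs. The chain below follows the order documented for AWS Signature Version 4 (date, then region, then service, then a fixed terminator string); treat it as an illustration of the idea rather than a drop-in signer. The function names are hypothetical.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret: str, date: str, region: str, service: str) -> bytes:
    """Derive a key scoped to one day, one region, and one service.

    Each step keys the next HMAC with the previous result, so a service that
    holds only the final key cannot recover the original secret or compute
    the key for another service, region, or day.
    """
    k_date = _hmac_sha256(("AWS4" + secret).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder secret and scope values, for illustration only.
s3_key = derive_signing_key("example-secret", "20240101", "us-east-1", "s3")
iam_key = derive_signing_key("example-secret", "20240101", "us-east-1", "iam")
```

A key service can hand `s3_key` to the storage service and `iam_key` to the identity service; compromising one of them exposes a single service for a single day.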

Cons:


Requires significant custom code to implement signing on clients and to verify signatures on the server.
If done wrong, ad hoc schemes can create major vulnerabilities. For example, Flickr used a custom MD5 scheme instead of a secure HMAC, which allowed forgery attacks: https://dl.packetstormsecurity.net/0909-advisories/flickr_api_signature_forgery.pdf
Server must store secret key in plaintext.

Example APIs:

Amazon Web Services API: https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html
Amazon S3 API: https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html

User Password (not really an API)

Some services may not have been designed with programmatic access in mind. For systems like these that have no API, it's common to simply create a "user" account that is actually used by a machine. This can be thought of as a degenerate case of "Simple API key" above, with worse security in most respects.
Pros:


Very simple to implement for both clients and servers
Works even for services never intended for automated use

Cons:


Very difficult to rotate credentials without downtime (requires multiple user accounts to perform gracefully)
Password reset flows may compromise security. Most services allow performing a password reset with only access to the email address on file.
Must ensure password is generated with high entropy, since the service does not generate it.
Difficult to properly scope permissions. Usually a user account has full control over changing important details of the account, such as the email address on file, the password, or other settings. These are not permissions that you would normally want to grant to a machine client.
If TLS is broken, security is completely lost (leads to loss of confidentiality, integrity, replay resistance)
Shoehorning automated access into a user account may lead to other problems.

Example APIs:


Active Directory service accounts
Database machine users
GitHub "machine accounts". While GitHub accounts are normally intended for humans, GitHub also documents their use for automated purposes. https://help.github.com/articles/github-terms-of-service/#3-account-requirements

Additional measures for reducing risk

Putting IP address whitelists in firewalls and using VPN connections can reduce the risk of running connections over the Internet, and can limit the damage if one of the authentication mechanisms has a vulnerability or is misconfigured.
Securely exchanging the secret keys is a challenge in many of these mechanisms. Usually you want the server to generate a fully random API secret token and deliver it to the user only once. The server should never reveal the secret key after initial delivery.
It's also important to be able to have multiple API keys in rotation, each valid for a specified duration. For example, the AWS API allows you to mark a specific API key inactive before deleting it, so that you can confirm non-destructively that no clients are still using it.
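
One way to model that rotation workflow is to give each key an expiry and an active flag, and to record every attempt to use it, including rejected ones. The sketch below is an assumption about how such bookkeeping could look, not a description of how AWS implements it; the class and function names are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiKey:
    key_id: str
    expires_at: float                        # Unix time after which the key is rejected
    active: bool = True                      # set to False as a reversible "soft delete"
    last_attempt_at: Optional[float] = None  # last time any client presented this key


def authorize(key: ApiKey, now: Optional[float] = None) -> bool:
    """Accept the key only if it is active and unexpired; always record the attempt."""
    now = time.time() if now is None else now
    key.last_attempt_at = now  # rejected attempts count too, so stragglers stay visible
    return key.active and now < key.expires_at


def safe_to_delete(key: ApiKey, quiet_seconds: float, now: Optional[float] = None) -> bool:
    """An inactive key that nobody has presented for quiet_seconds can be removed."""
    now = time.time() if now is None else now
    if key.active:
        return False
    return key.last_attempt_at is None or now - key.last_attempt_at >= quiet_seconds
```

If a forgotten client turns up after the key is marked inactive, the key can simply be reactivated, which is not possible once it has been deleted.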
Also, always add a version number to any API! Versioning lets the authentication scheme evolve without breaking existing clients. Amazon has a very sophisticated and secure API signing scheme today, but it took them four major iterations to get there.