nicolasdao/oauth_openid_guide.md

## oauth_openid_guide.md

      
OAUTH 2.0 & OPENID CONNECT GUIDE

Table of contents


Why is it so confusing?

OAuth 2.0 is often mixed in the login/signup flows though it cannot create accounts
Many popular IdPs implement OAuth 2.0 using OpenID conventions, though they are not OpenID compliant
Why pseudo-authentication is ok and not ok


OAuth 2.0 vs OpenID Connect

OpenID Connect

Identity claims
Discoverability


HTTP(S) endpoints
Grant type flows

Authorization code flow

Authorization code use case
Proof Key For Code Exchange(PKCE)


Implicit flow
Client credentials flow
Password flow
Refresh token flow
Device code flow


Tokens

Token specs

code specs
access_token specs
id_token specs
refresh_token specs


JWT structure and encoding/decoding
JWKS - JSON Web Key Sets


FAQ

How to decode a JWT?


Annexes

WebAuthn


References


Why is it so confusing?

OAuth 2.0 is often mixed in the login/signup flows though it cannot create accounts

If your API does not need to be used by third parties (i.e., you're building a private web API to power your own apps), then you don't really need OAuth. OAuth 2.0 is also not required when you wish to let your users log in or sign up using Facebook, Google or any other identity provider. In that case, the only requirement for your platform is to support interaction with the identity provider's Authorization Server (which most likely implements OAuth2, and maybe even OpenID Connect). OAuth2 has nothing to do with using a bearer JWT in the Authorization header of HTTP requests, though that's what OAuth2 flows use to transport tokens. Having an Authorization Server does not, by itself, make your API OAuth2 compliant.
Many popular IdPs implement OAuth 2.0 using OpenID conventions, though they are not OpenID compliant

OpenID solves the lack of an authentication standard in OAuth 2.0. It introduces the concept of an id_token on top of the existing OAuth 2.0 access_token and refresh_token. That id_token contains standardized claims about the user's identity. The required claims are:

iss: Issuer Identifier for the Issuer of the response. The iss value is a case sensitive URL using the https scheme that contains scheme, host, and optionally, port number and path components and no query or fragment components.
sub: Subject Identifier. That is the ID of the token owner (i.e., the ID you use in your own system to do a user lookup). It must not exceed 255 ASCII characters in length. The sub value is a case sensitive string.
aud: Audience(s) that this ID Token is intended for. It MUST contain the OAuth 2.0 client_id of the Relying Party as an audience value. It MAY also contain identifiers for other audiences. In the general case, the aud value is an array of case sensitive strings. In the common special case when there is one audience, the aud value MAY be a single case sensitive string.
exp: Expiration time on or after which the ID Token MUST NOT be accepted for processing. The processing of this parameter requires that the current date/time MUST be before the expiration date/time listed in the value. Implementers MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time. See RFC 3339 [RFC3339] for details regarding date/times in general and UTC in particular.
iat: Time at which the JWT was issued. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time.

Those claims have become de facto claims in many OAuth 2.0 API implementations. They allow the access_token to transport a small identity footprint, which enables some level of basic authentication (often referred to as pseudo-authentication).
Why pseudo-authentication is ok and not ok

As mentioned earlier, OAuth 2.0 is designed for authorization, not authentication. The standard that supports both authentication and authorization is OpenID Connect (OIDC). However, many developers authenticate users to their systems with OAuth 2.0 access_tokens coming from Facebook, GitHub, or LinkedIn. Those APIs are not OIDC compliant. This method is called pseudo-authentication. In its simplest form, pseudo-authentication assumes that a non-tampered, non-expired access_token from Facebook is good enough to consider the user logged in to the developer's app. However, this exposes a series of security vulnerabilities. For example, if both App_A's and App_B's users use Facebook to log in, then App_B's users can use the access_token they obtained for App_B to log in to App_A.
So why do we still use pseudo-authentication? Because nowadays, most OAuth 2.0 implementations include OpenID claims in their access_token claims:

iss: That's the Issuer Identifier. For example: 'https://accounts.google.com'.
sub: Subject Identifier. For example: '12345'
aud: Audience(s) that this token is intended for. For example: 'https://app_a.com'

In the example above, on top of verifying that the access_token has not been tampered with and is not expired, App_A must also check that the iss, sub and aud are allowed. If an App_B user uses their access_token (which contains an aud claim set to 'https://app_b.com') in App_A, App_A will notice a forbidden aud claim and will prevent the API access.
To conclude, if the access_token contains enough contextual claims and if the right access_token checks are implemented in each app, then using pseudo-authentication is perfectly fine.
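The claim checks described above can be sketched as follows. This is a minimal illustration, not a complete validator: `validateClaims`, its option names and its `reason` strings are hypothetical, and it assumes the token's signature has already been verified and its payload decoded.

```javascript
// Hypothetical helper: checks the iss, aud, exp and sub claims of an already
// verified and decoded token payload.
// - issuers: list of iss values App_A trusts (assumption).
// - audience: the aud value that identifies App_A (assumption).
// - now: current time in milliseconds, injectable to ease testing.
const validateClaims = (claims, { issuers, audience, now = Date.now() }) => {
	if (!claims)
		return { valid: false, reason: 'missing claims' }

	if (!issuers.includes(claims.iss))
		return { valid: false, reason: 'forbidden iss' }

	// aud can be a single string or an array of strings.
	const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud]
	if (!audiences.includes(audience))
		return { valid: false, reason: 'forbidden aud' }

	// exp is expressed in seconds since the epoch.
	if (typeof claims.exp !== 'number' || claims.exp * 1000 <= now)
		return { valid: false, reason: 'expired' }

	if (!claims.sub)
		return { valid: false, reason: 'missing sub' }

	return { valid: true }
}
```

With this sketch, a token issued for App_B (aud set to 'https://app_b.com') is rejected by App_A with the reason 'forbidden aud', even though its signature and expiry are fine.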
OAuth 2.0 vs OpenID Connect

OAuth 2.0 is about authorization. It delegates access to your API to a third-party system, with control over what that system is authorized to do. It can be used for pseudo-authentication, i.e., the access_token contains claims about the identity of the user (e.g., an id or email address). In most cases, devs wish to allow their systems to support login/signup via an Identity Provider (e.g., Google, Facebook) using pseudo-authentication. Supporting this feature does not make their system OAuth 2.0 compliant. It just means they can integrate with an external OAuth 2.0 compliant API. In fact, there are no OAuth2 flows to create users. OAuth2 can only allow existing users/accounts (called resource owners) to create tokens that grant access to their resources via the system's APIs.
As for OpenID, it adds explicit authentication to OAuth 2.0. In a nutshell, it is an OAuth 2.0 superset that exposes an extra token called the id_token, which contains explicit and standardized data regarding the resource owner's identity. The OAuth 2.0 specification defines no standard for identity claims in the access_token. When such claims are present, they are at the Identity Provider's own discretion. For example, as of 2021, Facebook, GitHub and LinkedIn use some of OpenID's standard claims in their access_token when specific OAuth scopes are requested. This is not standard. Among the providers mentioned in this guide, only Google uses OpenID as per the OIDC specification.
OpenID Connect

Identity claims

The required claims are:

iss: Issuer Identifier for the Issuer of the response. The iss value is a case sensitive URL using the https scheme that contains scheme, host, and optionally, port number and path components and no query or fragment components.
sub: Subject Identifier. That is the ID of the token owner (i.e., the ID you use in your own system to do a user lookup). It must not exceed 255 ASCII characters in length. The sub value is a case sensitive string.
aud: Audience(s) that this ID Token is intended for. It MUST contain the OAuth 2.0 client_id of the Relying Party as an audience value. It MAY also contain identifiers for other audiences. In the general case, the aud value is an array of case sensitive strings. In the common special case when there is one audience, the aud value MAY be a single case sensitive string.
exp: Expiration time on or after which the ID Token MUST NOT be accepted for processing. The processing of this parameter requires that the current date/time MUST be before the expiration date/time listed in the value. Implementers MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time. See RFC 3339 [RFC3339] for details regarding date/times in general and UTC in particular.
iat: Time at which the JWT was issued. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time.

Discoverability

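OpenID adds discoverability to OAuth 2.0: the provider publishes a JSON metadata document at /.well-known/openid-configuration that describes its OpenID web API.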
Google example: https://accounts.google.com/.well-known/openid-configuration
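As a rough sketch of how a client might use discovery, the helpers below build the discovery URL from an issuer and pick a few endpoints out of the returned metadata. `getDiscoveryUrl` and `pickEndpoints` are hypothetical names. The metadata field names (`authorization_endpoint`, `token_endpoint`, `userinfo_endpoint`, `jwks_uri`) are the ones defined by the OpenID Connect Discovery specification.

```javascript
// Builds the discovery URL for an issuer, tolerating a trailing slash.
const getDiscoveryUrl = issuer =>
	`${issuer.replace(/\/+$/, '')}/.well-known/openid-configuration`

// Extracts the endpoints this guide talks about from a discovery document.
const pickEndpoints = metadata => ({
	authorize: metadata.authorization_endpoint,
	token: metadata.token_endpoint,
	userinfo: metadata.userinfo_endpoint,
	certs: metadata.jwks_uri
})

// Usage (requires network access and a runtime that provides fetch):
// fetch(getDiscoveryUrl('https://accounts.google.com'))
// 	.then(res => res.json())
// 	.then(metadata => console.log(pickEndpoints(metadata)))
```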
HTTP(S) endpoints


| Pathname | Method | Type | Description |
|:---------|:-------|:-----|:------------|
| `/token` | `POST` | OAuth 2.0 | Gets one or many bearer tokens (e.g., access_token, refresh_token, id_token). |
| `/revoke` | `POST` | OAuth 2.0 | Revokes a refresh_token. |
| `/authorize` | `GET` | OAuth 2.0 | Redirects to your platform's consent page to prompt user to authorize a third-party to access their resources. |
| `/introspect` | `POST` | OAuth 2.0 | Introspects a token (e.g., access_token, refresh_token, id_token). |
| `/userinfo` | `GET` | OAuth 2.0 | Returns user's profile based on the claims associated with the access_token. |
| `/certs` | `GET` | OAuth 2.0 | Array of public JWK keys used to verify id_tokens. |
| `/.well-known/openid-configuration` | `GET` | OIDC | Discovery metadata JSON about OpenID web API only. |


Grant type flows

Grant types describe specific flows for retrieving tokens via the /token endpoint. The official OAuth 2.0 grant types are:

authorization_code aka code
implicit (deprecated in favor of the authorization code flow)
client_credentials 
password
refresh_token
device_code

Authorization code flow

Authorization code use case

This flow exists because users jump from a sign-in page to a consent page on a different domain, then back to another page (that most likely was not accessible before being logged in). All those jumps happen via HTTP redirects. The only way to pass secrets via HTTP redirects is via query parameters, which opens a series of attack vectors. To decrease the risk of attacks, instead of passing sensitive data (i.e., access_token or id_token) in the query parameters, a code is used. That code can then be exchanged for tokens via a safer HTTP POST.
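The two steps described above (redirect to get a code, then POST to exchange it) can be sketched as follows. `buildAuthorizeUrl` and `buildCodeExchangeBody` are hypothetical helpers and the endpoint URLs in the usage comment are placeholders. The parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `grant_type`, `code`) come from the OAuth 2.0 specification.

```javascript
// Step 1: URL the user is redirected to. After consent, the Authorization
// Server redirects back to redirectUri with ?code=...&state=...
const buildAuthorizeUrl = ({ authorizeEndpoint, clientId, redirectUri, scope, state }) => {
	const url = new URL(authorizeEndpoint)
	url.searchParams.set('response_type', 'code')
	url.searchParams.set('client_id', clientId)
	url.searchParams.set('redirect_uri', redirectUri)
	url.searchParams.set('scope', scope)
	url.searchParams.set('state', state)
	return url.toString()
}

// Step 2: form-encoded body POSTed to the /token endpoint to exchange the
// code for tokens. This request happens server to server, not via a redirect.
const buildCodeExchangeBody = ({ code, clientId, clientSecret, redirectUri }) =>
	new URLSearchParams({
		grant_type: 'authorization_code',
		code,
		client_id: clientId,
		client_secret: clientSecret,
		redirect_uri: redirectUri
	}).toString()

// Usage (placeholder values):
// buildAuthorizeUrl({
// 	authorizeEndpoint: 'https://idp.example.com/authorize',
// 	clientId: 'app_a',
// 	redirectUri: 'https://app_a.com/callback',
// 	scope: 'openid email',
// 	state: 'some-random-value'
// })
```

Only the short-lived code travels through the browser's address bar. The tokens themselves are returned in the response to the POST.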
Proof Key For Code Exchange(PKCE)

PKCE (pronounced pixy) is a security technique developed to protect against authorization code interception attacks.
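In short, the client generates a random code_verifier, sends its SHA-256 hash (the code_challenge) with the /authorize request, then sends the code_verifier itself when exchanging the code. The server recomputes the hash and rejects the exchange if it does not match. The sketch below shows the S256 method from the PKCE specification using Node's built-in crypto module. The helper names are hypothetical.

```javascript
const crypto = require('crypto')

// base64url encoding without padding, as required by the PKCE specification.
const toBase64Url = buffer => buffer
	.toString('base64')
	.replace(/\+/g, '-')
	.replace(/\//g, '_')
	.replace(/=+$/, '')

// 32 random bytes yield a 43-character verifier, the minimum length allowed.
const createCodeVerifier = () => toBase64Url(crypto.randomBytes(32))

// S256 method: code_challenge = BASE64URL(SHA256(code_verifier))
const createCodeChallenge = codeVerifier =>
	toBase64Url(crypto.createHash('sha256').update(codeVerifier).digest())

// Client side:
// const codeVerifier = createCodeVerifier()
// const codeChallenge = createCodeChallenge(codeVerifier)
// /authorize gets code_challenge=<codeChallenge>&code_challenge_method=S256
// /token gets code_verifier=<codeVerifier>
```

An attacker who intercepts the code cannot exchange it without the code_verifier, which never travels through the redirects.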
Implicit flow

This flow used to be the de facto option for browser-based apps before the Authorization code flow with PKCE became the recommended approach. It works similarly to the Authorization code flow except that the tokens are returned directly in the redirect URL (in the URL fragment) after the consent page.
Client credentials flow

With this grant type, a client_id and a client_secret are exchanged for an access_token and potentially an id_token if the scopes contain openid.
Password flow

With this grant type, a client_id, username and password are exchanged for an access_token and potentially an id_token if the scopes contain openid.
Refresh token flow

With this grant type, a refresh_token is exchanged for a new access_token and potentially a new id_token if the initial scopes contained openid.
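The three grant types above differ mainly in the form-encoded body POSTed to the /token endpoint. The sketch below illustrates those bodies. `tokenRequestBodies` is a hypothetical helper, and it assumes the client authenticates by sending its credentials in the body (some servers expect an HTTP Basic Authorization header instead).

```javascript
// Each function returns the application/x-www-form-urlencoded body of a
// POST /token request for one grant type.
const tokenRequestBodies = {
	clientCredentials: ({ clientId, clientSecret, scope }) =>
		new URLSearchParams({
			grant_type: 'client_credentials',
			client_id: clientId,
			client_secret: clientSecret,
			scope
		}).toString(),

	password: ({ clientId, username, password, scope }) =>
		new URLSearchParams({
			grant_type: 'password',
			client_id: clientId,
			username,
			password,
			scope
		}).toString(),

	refreshToken: ({ clientId, refreshToken }) =>
		new URLSearchParams({
			grant_type: 'refresh_token',
			client_id: clientId,
			refresh_token: refreshToken
		}).toString()
}

// Usage (placeholder values):
// tokenRequestBodies.clientCredentials({ clientId: 'app_a', clientSecret: 's3cret', scope: 'openid' })
```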
Device code flow

Under construction...
Tokens

Token specs

The OAuth 2.0 and OIDC specifications use four tokens:

code
access_token
id_token
refresh_token

code specs

A code must:

Be protected against tampering(1).
Be very short-lived. OAuth2 recommends setting the duration between 30 and 60 seconds. The longest duration allowed by OAuth2 is 10 minutes.
Be associated with the context of the original request that created that code(2). That context must include at a minimum:

The client_id so only the same client ID can exchange that code for another artifact later.
The creation date or the expiry date so the code can be invalidated if it has expired (recommended duration is 30 to 60 seconds, max. 10 minutes).
The scopes that this code can associate with a token. Certain flows (e.g., refresh token scenario when the Authorization Code flow is used) rely on those scopes.
The audience(s) that this code can associate with a token.


(1) It is entirely up to you to decide how to protect your code against tampering. Two suggestions are listed below (2).
(2) It is entirely up to you to decide how to persist that context between requests. Here are two suggestions on how to achieve this:

Encrypt the context in the code itself. Because this context is not meant to be shared with anyone but you, a simple symmetric encryption such as AES should do (preferably an authenticated mode such as AES-GCM, so that tampering is detected). When the code comes back later, it can be validated to ensure it has not been tampered with and the context can be retrieved.
Generate your code as a simple unique identifier and store it with its context in a secured persistent storage. When it comes back later, that context can be extracted from that persistent storage.
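The first suggestion (encrypting the context in the code itself) could look like the sketch below. It is only an illustration under several assumptions: it uses AES-256-GCM from Node's crypto module, `createCode` and `readCode` are hypothetical names, the context carries an `exp` field in epoch seconds, and the key is generated in memory whereas a real server would load it from a secret store.

```javascript
const crypto = require('crypto')

// Assumption: a 32-byte server-side secret. Generated here for the demo only.
const KEY = crypto.randomBytes(32)

const toBase64Url = buffer => buffer.toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '')
const fromBase64Url = str => Buffer.from(str.replace(/-/g, '+').replace(/_/g, '/'), 'base64')

// Encrypts the context (client_id, scopes, audiences, exp) into an opaque code.
// Layout: 12-byte IV | 16-byte auth tag | ciphertext.
const createCode = (context, key = KEY) => {
	const iv = crypto.randomBytes(12)
	const cipher = crypto.createCipheriv('aes-256-gcm', key, iv)
	const encrypted = Buffer.concat([cipher.update(JSON.stringify(context), 'utf8'), cipher.final()])
	return toBase64Url(Buffer.concat([iv, cipher.getAuthTag(), encrypted]))
}

// Returns the context, or null if the code was tampered with or has expired.
const readCode = (code, key = KEY) => {
	try {
		const raw = fromBase64Url(code)
		const decipher = crypto.createDecipheriv('aes-256-gcm', key, raw.subarray(0, 12))
		decipher.setAuthTag(raw.subarray(12, 28))
		// final() throws if the auth tag does not match, i.e. on tampering.
		const json = Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString('utf8')
		const context = JSON.parse(json)
		return context.exp * 1000 <= Date.now() ? null : context
	} catch (err) {
		return null
	}
}
```

The second suggestion (an opaque identifier plus persistent storage) trades this cryptography for a database lookup and makes it easy to enforce that a code is used only once.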


access_token specs

An access_token must:

Be cryptographically signed on your server so you can validate that it has indeed been issued by you and that it has not been tampered with when you get it back.
Be short-lived. It cannot be valid for more than an hour after being issued.
Be able to be associated with:

The client_id that made the original request.
The owner's identity, whether that owner is a user or a service account.
The scopes it was generated from.
The audience(s) that can be accessed by this token(1).


Criteria 2 and 3 can be met by encoding this token as a JWT, but this is not required by the OAuth2 specification. If you decide to use JWT to implement your access tokens, the standard approach is to include some of the following standard OIDC claims:

iss: Issuer Identifier for the Issuer of the response. The iss value is a case sensitive URL using the https scheme that contains scheme, host, and optionally, port number and path components and no query or fragment components.
sub: Subject Identifier. That is the ID of the token owner (i.e., the ID you use in your own system to do a user lookup). It must not exceed 255 ASCII characters in length. The sub value is a case sensitive string.
exp: Expiration time on or after which the ID Token MUST NOT be accepted for processing. The processing of this parameter requires that the current date/time MUST be before the expiration date/time listed in the value. Implementers MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time. See RFC 3339 [RFC3339] for details regarding date/times in general and UTC in particular.
iat: Time at which the JWT was issued. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time.
You should also include the scope, aud and client_id.


Even if you choose to not use a JWT, those claims above should be associated with the access token in one way or the other.


The sub must be unique per iss. You can use the iss + sub to uniquely identify users.

Here is an example from an Okta JWT access_token:
{
	"iss": "https://micah.okta.com/oauth2/aus2yrcz7aMrmDAKZ1t7",
	"sub": "okta_oidc_fun@okta.com",
	"exp": 1501531801,
	"iat": 1501528201,
	"scope": "openid email",
	"client_id": "0oa2yrbf35Vcbom491t7",
	"aud": "test",
	"token_type": "Bearer",
	"active": true,
	"username": "okta_oidc_fun@okta.com",
	"jti": "AT.upPJqU-Ism6Fwt5Fpl8AhNAdoUeuMsEgJ_VxJ3WJ1hk",
	"uid": "00u2yulup4eWbOttd1t7"
}
To learn more about the various strategies used to generate access tokens and maintain their state, please refer to this article called OAuth Access Token Implementation.

(1) Though the OAuth2 documentation specifies that the aud MUST contain the client_id of the resources that can accept it, most concrete implementations use URIs. For example, let's say that your token can only access the following two APIs: https://api.example.com and https://api.otherexample.com/somepath. In that case, the aud value associated with your token would be: "aud":"https://api.example.com https://api.otherexample.com".
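As an illustration of the "cryptographically signed" and "short-lived" criteria, here is a minimal sketch of a JWT access token signed with HS256 (HMAC-SHA256) using Node's crypto module. `signToken` and `verifyToken` are hypothetical helpers and the shared secret is an assumption. Real providers such as the Okta example above typically use asymmetric algorithms like RS256 and a maintained JWT library.

```javascript
const crypto = require('crypto')

const toBase64Url = input => Buffer.from(input).toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '')

// HMAC-SHA256 of 'header.payload', base64url encoded.
const sign = (data, secret) => toBase64Url(crypto.createHmac('sha256', secret).update(data).digest())

// Creates a JWT whose payload carries the claims (iss, sub, exp, iat, scope, aud, client_id).
const signToken = (claims, secret) => {
	const header = toBase64Url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }))
	const payload = toBase64Url(JSON.stringify(claims))
	return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`
}

// Returns the claims if the signature is valid and the token is not expired, else null.
const verifyToken = (token, secret) => {
	const [header, payload, signature] = (token || '').split('.')
	if (!header || !payload || !signature)
		return null

	const actual = Buffer.from(signature)
	const expected = Buffer.from(sign(`${header}.${payload}`, secret))
	// Constant-time comparison to avoid leaking information through timing.
	if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected))
		return null

	const json = Buffer.from(payload.replace(/-/g, '+').replace(/_/g, '/'), 'base64').toString('utf8')
	const claims = JSON.parse(json)
	return claims.exp > Math.floor(Date.now() / 1000) ? claims : null
}
```

Because the claims travel inside the signed token, the API can check the client_id, owner, scopes and audience without a database lookup.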

id_token specs

An id_token must:

Be cryptographically signed on your server so you can validate that it has indeed been issued by you and that it has not been tampered with when you get it back.
Be short-lived. It cannot be valid for more than an hour after being issued.
Be a JWT containing at a minimum the following claims (to be compliant with the OIDC specification):

iss: Issuer Identifier for the Issuer of the response. The iss value is a case sensitive URL using the https scheme that contains scheme, host, and optionally, port number and path components and no query or fragment components.
sub: Subject Identifier. That is the ID of the token owner (i.e., the ID you use in your own system to do a user lookup). It must not exceed 255 ASCII characters in length. The sub value is a case sensitive string.
aud: Audience(s) that this ID Token is intended for. It MUST contain the OAuth 2.0 client_id of the Relying Party as an audience value. It MAY also contain identifiers for other audiences. In the general case, the aud value is an array of case sensitive strings. In the common special case when there is one audience, the aud value MAY be a single case sensitive string.
exp: Expiration time on or after which the ID Token MUST NOT be accepted for processing. The processing of this parameter requires that the current date/time MUST be before the expiration date/time listed in the value. Implementers MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time. See RFC 3339 [RFC3339] for details regarding date/times in general and UTC in particular.
iat: Time at which the JWT was issued. Its value is a JSON number representing the number of seconds from 1970-01-01T0:0:0Z as measured in UTC until the date/time.


The sub must be unique per iss. You can use the iss + sub to uniquely identify users.

This JWT can also contain more reserved OIDC fields defined by the OIDC specification. The other optional fields are referred to as claims (e.g., name, family_name, given_name). OIDC defines a series of standard claims associated with each scope. The number of claims contained in the id_token depends on the scopes passed in the request. You are not limited to the standard claims. You can create your own scopes and associate whatever custom claims you want with each scope.
Here is an example from an Okta JWT id_token:
{
	"iss": "https://micah.okta.com/oauth2/aus2yrcz7aMrmDAKZ1t7",
	"sub": "00u2yulup4eWbOttd1t7",
	"aud": "0oa2yrbf35Vcbom491t7",
	"exp": 1501535822,
	"iat": 1501532222,
	"name": "Okta OIDC Fun",
	"locale": "en-US",
	"email": "okta_oidc_fun@okta.com",
	"ver": 1,
	"jti": "ID.Zx8EclaZmhSckGHOCRzOci2OaduksmERymi9-ad7ML4",
	"amr": [
		"pwd"
	],
	"idp": "00o1zyyqo9bpRehCw1t7",
	"nonce": "c96fa468-ca1b-46f0-8974-546f23f9ee6f",
	"preferred_username": "okta_oidc_fun@okta.com",
	"given_name": "Okta OIDC",
	"family_name": "Fun",
	"zoneinfo": "America/Los_Angeles",
	"updated_at": 1499922371,
	"email_verified": true,
	"auth_time": 1501528157
}

(1) Though the OAuth2 documentation specifies that the aud MUST contain the client_id of the resources that can accept it, most concrete implementations use URIs. For example, let's say that your token can only access the following two APIs: https://api.example.com and https://api.otherexample.com/somepath. In that case, the aud value associated with your token would be: "aud":"https://api.example.com https://api.otherexample.com".
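The association between scopes and claims mentioned in this section can be sketched as a simple lookup. The mapping below follows the standard scopes defined by the OIDC specification (profile, email, address, phone). `SCOPE_CLAIMS`, `claimsForScopes` and the shape of the `user` object are hypothetical.

```javascript
// Standard OIDC scopes and the claims each one unlocks.
const SCOPE_CLAIMS = {
	profile: ['name', 'family_name', 'given_name', 'middle_name', 'nickname', 'preferred_username',
		'profile', 'picture', 'website', 'gender', 'birthdate', 'zoneinfo', 'locale', 'updated_at'],
	email: ['email', 'email_verified'],
	address: ['address'],
	phone: ['phone_number', 'phone_number_verified']
}

// Returns the subset of the user's data allowed by the space-delimited scope string.
// Custom scopes can be supported by adding entries to SCOPE_CLAIMS.
const claimsForScopes = (scope, user) => {
	const requested = (scope || '').split(' ').filter(Boolean)
	const claims = {}
	for (const name of requested)
		for (const claim of SCOPE_CLAIMS[name] || [])
			if (user[claim] !== undefined)
				claims[claim] = user[claim]
	return claims
}

// Usage: claimsForScopes('openid email', user) only exposes email and email_verified.
```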

refresh_token specs

A refresh_token must:

Be protected against tampering(1).
Be long-lived. Theoretically, this token could live forever, or you could set an expiry date far from its creation date. It is entirely up to you to decide how long you want those tokens to exist. However, what you may want to support is the ability to revoke them.
Be associated with the context of the original request that created that refresh token(2). That context must include at a minimum:

The client_id so only the same client ID can exchange that refresh token for another artifact.
If the refresh token does not last forever, the creation date or the expiry date must be associated with the refresh token so it can be invalidated if it has expired.
The scopes that this refresh token was originally requested for. When the refresh token is used to acquire another token, those scopes need to be checked against the client_id to make sure they are still accessible.
The audience(s) that this refresh token was originally requested for. When the refresh token is used to acquire another token, those audiences need to be checked against the client_id to make sure they are still accessible.


(1) It is entirely up to you to decide how to protect your refresh token against tampering.
(2) It is entirely up to you to decide how to persist that context between requests. Because it is highly desirable to invalidate refresh tokens, a natural solution is to persist refresh tokens (incl. persisting their associated context) in your own database.
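The persistence approach suggested in (2) can be sketched with an in-memory Map standing in for your database. `createRefreshTokenStore` and its method names are hypothetical, and a real implementation would persist the records and usually store a hash of the token rather than the token itself.

```javascript
const crypto = require('crypto')

const createRefreshTokenStore = () => {
	// token -> context of the original request
	const store = new Map()

	return {
		// Creates an opaque random token and saves its context.
		// ttlSeconds = null means the token never expires.
		issue: ({ clientId, scopes, audiences, ttlSeconds = null }) => {
			const token = crypto.randomBytes(32).toString('hex')
			const expiresAt = ttlSeconds === null ? null : Date.now() + ttlSeconds * 1000
			store.set(token, { clientId, scopes, audiences, expiresAt })
			return token
		},

		// Returns the context if the token exists, belongs to the client and has not expired.
		redeem: (token, clientId) => {
			const context = store.get(token)
			if (!context || context.clientId !== clientId)
				return null
			if (context.expiresAt !== null && context.expiresAt <= Date.now()) {
				store.delete(token)
				return null
			}
			return context
		},

		// Revocation is a simple delete. Returns true if the token existed.
		revoke: token => store.delete(token)
	}
}
```

Because an unknown token simply fails the lookup, this design also covers the tampering requirement without any signature.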

JWT structure and encoding/decoding

JWTs are not encrypted by default. They are only base64url-encoded and signed, which means that anybody can decode them to inspect their content. That's why they should not contain any secrets.
A JWT is structured as follows:
Header.Payload.Signature

/**
 * Decodes a JWT. This does NOT verify the signature.
 * 
 * @param  {String} token		JWT string token
 * 
 * @return {Object} jwt			null if the token is empty or malformed
 * @return {Object} 	.header
 * @return {String} 		.kid
 * @return {String} 		.alg	e.g., 'RS256'
 * @return {Object} 	.payload
 * @return {String} 		.iss
 * @return {String} 		.sub
 * @return {String} 		.scope  e.g., 'phone openid profile email'
 * @return {Number} 		.exp	Expiry epoch time. Unit seconds (e.g., 1634090736)
 * @return {Number} 		.iat	Issued at epoch time. Unit seconds (e.g., 1634087136)
 * @return {String} 	.signBase64
 */
const decodeJwt = token => {
	if (!token)
		return null

	// A JWT is made of exactly three base64url-encoded parts separated by dots.
	const parts = token.split('.')
	if (parts.length !== 3)
		return null

	const [headerBase64, payloadBase64, signBase64] = parts

	return {
		header: base64ToJson(headerBase64),
		payload: base64ToJson(payloadBase64),
		signBase64
	}
}

const base64ToJson = b64 => {
	// Converts base64url to standard base64 before decoding.
	const base64 = b64.replace(/-/g, '+').replace(/_/g, '/')
	// atob returns a binary string. Percent-encoding each byte lets
	// decodeURIComponent rebuild multi-byte UTF-8 characters.
	const jsonPayload = decodeURIComponent(
		atob(base64)
			.split('')
			.map(c => '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2))
			.join(''))

	return JSON.parse(jsonPayload)
}
JWKS - JSON Web Key Sets

https://auth0.com/docs/tokens/json-web-tokens/json-web-key-sets
NodeJS library that can convert a pem to JWK: pem-jwk
NodeJS library that can convert a JWK to pem: rsa-pem-from-mod-exp
Example of JWKS output for the various ciphers: https://connect2id.com/products/nimbus-jose-jwt/examples/jwk-generation
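As a rough sketch of what a JWKS is used for, the function below picks the key matching the token's kid header and verifies an RS256 signature with Node's crypto module. `verifyWithJwks` is a hypothetical helper. It assumes a recent Node.js version (16 or later) where `crypto.createPublicKey` accepts the `jwk` format and Buffers support the `base64url` encoding. It only checks the signature, not the claims.

```javascript
const crypto = require('crypto')

// jwks is the JSON served by the /certs endpoint: { keys: [ { kid, kty, n, e, ... } ] }
const verifyWithJwks = (token, jwks) => {
	const [headerB64, payloadB64, signatureB64] = (token || '').split('.')
	if (!headerB64 || !payloadB64 || !signatureB64)
		return false

	const header = JSON.parse(Buffer.from(headerB64, 'base64url').toString('utf8'))
	// This sketch only supports RS256 (RSASSA-PKCS1-v1_5 with SHA-256).
	if (header.alg !== 'RS256')
		return false

	// The kid header tells which key of the set signed the token.
	const jwk = ((jwks && jwks.keys) || []).find(key => key.kid === header.kid)
	if (!jwk)
		return false

	const publicKey = crypto.createPublicKey({ key: jwk, format: 'jwk' })
	return crypto.verify(
		'RSA-SHA256',
		Buffer.from(`${headerB64}.${payloadB64}`),
		publicKey,
		Buffer.from(signatureB64, 'base64url'))
}
```

Publishing several keys under different kid values is what lets an Authorization Server rotate its signing keys without breaking tokens that are still valid.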
FAQ

How to decode a JWT?

Please refer to the JWT structure and encoding/decoding section.
Annexes

WebAuthn

https://webauthn.io/
References


OAuth 2.0
OAuth 2.0 scopes
OAuth 2.0 standard errors
OAuth 2.0 exhaustive errors
OAuth 2.0 access token implementation
OAuth 2.0 Google API Primer
OpenID Connect scopes
OpenID Connect claims
OpenID Connect scopes and their associated claims
OpenID Connect discovery metadata
OpenID Connect discovery metadata example
Okta - OpenID Connect API list
Okta - Scope dependent claims
Okta - Identity, Claims, & Tokens – An OpenID Connect Primer
Okta - Verify the token signature and the meaning of the standard claims
Auth0 - Authorization Code Flow with Proof Key for Code Exchange (PKCE)
Proof Key For Code Exchange(PKCE) specification