clemensv/tenant-bridging.md

Tenant Bridges

This document defines a mechanism to establish multiplexed, platform-level information flow between the tenant scopes of different multitenant platform-as-a-service (PaaS) or software-as-a-service (SaaS) systems.
In particular, this specification introduces the concept of a tenant-bridging “channel” that is established as a communication link between the tenants of two cooperating platforms. While the mechanism defined here is quite simple, the problem it solves requires some context.
What are we solving?

The "cooperating platforms" are commonly from different cloud service vendors and are addressing different customer scenarios. Examples might be enterprise resource planning (ERP) on one side and general-purpose application hosting on the other. A scenario to enable might be for the ERP system to be easily extensible with event-driven serverless functionality hosted on the other platform.
In such a scenario, it will likely be the same customer organization that is a tenant-owner of the first vendor's ERP service and also a tenant-owner of the second vendor's serverless application host, and they will try to integrate their resources on either side across the platform boundaries. For example, they might want to run some custom code in the serverless application host that reacts to an event being raised about an object in the ERP system changing its state or being created.
A very significant challenge to overcome in such a scenario is that the user identity, user role assignments, and access control permission models are generally not aligned across the two platforms.
It is often impossible for someone acting under the ERP’s native user identity to interact with resources in the application hosting environment and vice versa. That makes it, for instance, impossible for a developer acting under an identity from the application hosting environment to subscribe to events raised from within the ERP environment, even when there are well-known APIs like the CloudEvents Subscription API, because the subscriber’s identity is simply not understood in the other system.
While it is possible to build federation scenarios using OpenID Connect and OAuth 2.0 that allow for the same identity to be used across environments or for identity tokens to be traded across environments, these setups are enormously complex, and practically impossible in complex multitenant platforms as we will discuss shortly.
The authorization integration gets even more complicated when cooperating platforms want to realize the information flow at the platform level, underneath the tenancy abstractions and via multiplexed communication channels, where the platform elements need to act and be authorized under their own identities.
The goal of the tenant-bridging channels introduced here is to create a simple but sufficient authorization model and to eliminate all requirements for the identity and access control models of the cooperating platforms to be integrated across tenant scopes. The tenant-bridging model assumes a trusted-subsystem relationship between the platforms for the multiplexed data flows and establishes trust between the tenant environments on either side through an administrative handshake.
Multitenancy

In platform-as-a-service and software-as-a-service systems, multitenancy is a common approach to reduce operational cost and allow for instant allocation of resources to customers. Multitenancy means that shared underlying infrastructure resources are used for multiple customers (tenants) who are presented with an illusion of fully isolated, virtual environments.
This virtual environment isolation is commonly either scoped to a single service or, broader and spanning multiple services, to a whole platform:
First, the tenant’s resources for a particular service are commonly reflected by some named scoping concept. This might be a service instance like a virtual machine host or it might be a container construct (common names: account, namespace, server, cluster) into which the tenant can place further resources like queues or databases.
Second, and that mostly applies to larger cloud platforms with several services, there is a further, higher level scoping concept for organizing instances of several services under a single umbrella, which is typically related to a particular solution or some organizational unit of the tenant.
In systems that only have the first scoping model, the service-tenant and the resource are synonymous. In systems that also have the second scoping model, this platform-tenant will have several resources. There might also be multiple resources of the same kind, each reflecting a separate service-tenant.
To manage access to a resource, like allowing to send messages to a broker or allowing read access to a database, permissions are granted on the resource to a well-known identity. In simple cases, the identity is some account directly managed by and internal to the resource, but with larger platforms it will likely be an identity that is managed at the platform-tenant level, and which can be granted access to several resources across the platform.
Clients acting under that identity can then interact with those resources using a single-sign-on model, whereby the identity is established once through authentication and access to services is then granted based on permissions given to that identity.
The desire to manage access control in a way that is consistent across a platform with multiple multitenant services practically leads to having closed authorization systems. Platform services are typically not configurable to trust multiple authorization services simultaneously: this could too easily lead to inconsistent access control rules that attackers could potentially exploit in combination. Those closed authorization systems do, in turn, commonly prefer the platform’s native identity management system.
While all involved service endpoints are commonly based on standards like OpenID Connect and OAuth 2.0 these days, the necessary constraints on the trust relationships between platform services and authorization services and between authorization services and identity services make it difficult to integrate external platform elements, especially if those are part of a similarly closed system.
Cooperating Multitenant Systems

Multitenancy presents a particular challenge when two multitenant systems are being interconnected and require a shared understanding about the mapping of tenant scopes of, for instance, a shared customer, while the detailed tenancy concepts on either side might differ significantly.
If an organization A is a customer of cooperating platforms P and Q and has tenant accounts in both, it appears reasonable for A to expect that someone in A’s organization, Alice, can create a serverless function in Q, subscribe to events from a service in P, and have those events delivered into that function.
Unfortunately, the scenario can get impossibly complicated as we explore further details.

The delivery of events from P on behalf of A needs to be authorized at the subscribing endpoint in Q that runs on behalf of A, without either system having a common concept of who A is.
The developer or deployment engineer Alice needs to be authorized at P to subscribe to events, but her digital identity in the Q system is unknown at P and cannot be assigned permissions. To make matters worse, Alice might want to perform those operations through a platform portal, which means that those operations might need to be executed by the system on her behalf.

There are authorization and trust issues at discovery and subscription time as well as at delivery time of events. All of those operations need to cross the platform chasm and require an actor from one platform to perform an operation on the other.
Add to this that both cooperating platforms might each have thousands of tenants which need to be interconnected in this fashion.
A standards-based approach from the identity space to tackling this issue is token exchange as defined in IETF RFC 8693 (OAuth 2.0 Token Exchange), where security tokens from one system can be exchanged for tokens from another system.
That means Alice could use an identity token from Q, have it exchanged for an identity token in P, and then keep acting as herself in system P without having to reauthenticate. That does, however, require that Alice also has an identity in P for which the exchanged token can be issued. And, of course, it’s not enough for there to be dual accounts: all client code would also need to know about these exchanges and be instructed to perform them when needed. That is all excessively complicated.
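To give a sense of the mechanics each client would have to carry, here is a minimal sketch of the form-encoded body of an RFC 8693 token exchange request. The parameter names and URN values are the ones defined by RFC 8693; the function name, the audience URL, and the token string are placeholders invented for this illustration, and the sketch does not cover client authentication or the response handling.

```python
from urllib.parse import urlencode, parse_qs

# Grant type and token type identifiers as defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(subject_token: str, audience: str) -> str:
    """Form-encode the body of a token exchange request.

    subject_token: the token Alice obtained from Q (placeholder below).
    audience: hypothetical name of the target service in P.
    """
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
    })


# Placeholder values; a real client would POST this body to P's token endpoint.
body = build_token_exchange_body("token-issued-by-Q", "https://p.example/events")
fields = parse_qs(body)
```

Every client that crosses the platform boundary would need this logic, plus a matching account in P, which is the overhead the remainder of this document tries to avoid.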
Tenant-Bridging Channels

To simplify all this, we need a mapping model that lets two such multitenant systems establish a cross-system communication flow from a tenant scope of system P to a tenant scope of system Q, without the access control and identity systems of either system integrating or interfering with the other at all.
We achieve this by introducing the concept of a communication "channel" with two termini, whereby tenant access to one terminus is managed by the first system and tenant access to the other terminus is managed by the other system. The channel itself does not impose access control rules on either side, but rather entrusts the systems on either end with access control governance.
The channel can be unidirectional or bidirectional and may carry streams or datagrams.
To establish a channel, one of the systems – acting under its own identity – initiates creating a channel on the middleware that provides it. That middleware might be provided by either system P or system Q or a neutral party.
Either system will require some credential enabling interaction with the middleware. If the middleware is not neutral but provided by one of the systems – say P – there is an identity system touchpoint as system Q needs to perform that interaction under an identity understood by P, but that is purely a local concern and equivalent to the credential management required if the channel middleware were provided by a neutral party.
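The shape of such a channel can be sketched as a small data model. This is only an illustration under assumptions: the type and field names (`Terminus`, `Channel`, `governed_by`) and the example tenant identifiers are hypothetical and not part of this specification. The point it shows is that the channel records which system governs each end, and carries no access control rules of its own.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bidirectional"


class Payload(Enum):
    STREAM = "stream"
    DATAGRAM = "datagram"


@dataclass(frozen=True)
class Terminus:
    system: str        # platform that governs access to this end, e.g. "P"
    local_tenant: str  # tenant identifier, meaningful only inside that system


@dataclass(frozen=True)
class Channel:
    direction: Direction
    payload: Payload
    initiator: Terminus
    acceptor: Terminus

    def governed_by(self, system: str) -> Terminus:
        """Return the terminus whose access control belongs to `system`."""
        for terminus in (self.initiator, self.acceptor):
            if terminus.system == system:
                return terminus
        raise KeyError(f"{system} is not attached to this channel")


# Hypothetical tenant identifiers for illustration.
channel = Channel(
    Direction.UNIDIRECTIONAL,
    Payload.DATAGRAM,
    initiator=Terminus("P", "erp-tenant-42"),
    acceptor=Terminus("Q", "sub-7f3a"),
)
```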
Channel creation handshake

Channel creation will typically be initiated by an end-user who has administrative control of the tenant scopes in both systems.
On the initiating system P, the authorized tenant administrator will initiate creating a channel and provide a tenant identifier understood by the other system Q.
The channel will then be created, and the initiating system's tenant can access the created channel and interact with it for communication. It is the initiating system's responsibility to decide how to surface channel access to its tenant and how to secure it.
The other system Q will be notified of the channel creation, but the channel will not be active and usable until it is accepted.
The accepting system will determine whether the supplied identifier matches a known tenant. If that is the case, the system will then propose the channel to the tenant for acceptance. By accepting the channel, the tenant's administrator permits communication between its resources and the channel.
After the channel has been accepted, the channel is usable for communication between the tenant in P and the tenant in Q.
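The handshake can be read as a small state machine on the accepting side. The following is a minimal sketch, assuming hypothetical class and method names (`ChannelOffer`, `AcceptingSystem`, `notify`, `accept`). The text above does not say what happens when the supplied identifier matches no known tenant; marking the offer as rejected is an assumption made here for illustration.

```python
from enum import Enum


class ChannelState(Enum):
    PENDING = "pending"    # created by the initiator, not yet usable
    ACTIVE = "active"      # accepted by the tenant's administrator
    REJECTED = "rejected"  # assumption: identifier matched no known tenant


class ChannelOffer:
    def __init__(self, initiator_tenant: str, target_tenant_id: str):
        self.initiator_tenant = initiator_tenant
        self.target_tenant_id = target_tenant_id
        self.state = ChannelState.PENDING


class AcceptingSystem:
    def __init__(self, known_tenants: set[str]):
        self.known_tenants = known_tenants
        self.proposals: dict[str, list[ChannelOffer]] = {}

    def notify(self, offer: ChannelOffer) -> None:
        """Called when the initiating system has created a channel."""
        if offer.target_tenant_id not in self.known_tenants:
            offer.state = ChannelState.REJECTED
            return
        # Known tenant: propose the channel to it; the offer stays pending.
        self.proposals.setdefault(offer.target_tenant_id, []).append(offer)

    def accept(self, tenant_id: str, offer: ChannelOffer) -> None:
        """Called when the tenant's administrator accepts the proposal."""
        if offer not in self.proposals.get(tenant_id, []):
            raise PermissionError("offer was not proposed to this tenant")
        offer.state = ChannelState.ACTIVE


# Hypothetical tenant identifiers for illustration.
q = AcceptingSystem(known_tenants={"sub-7f3a"})
offer = ChannelOffer("erp-tenant-42", "sub-7f3a")
q.notify(offer)              # channel exists but is still pending
q.accept("sub-7f3a", offer)  # only now is the channel usable
```

Note that neither step requires Q to know anything about the initiating administrator's identity in P: Q only checks its own tenant list and its own administrator's decision.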
Channel Identifier

Each channel has a neutral channel identifier. The local tenant identifiers in the connected systems map to this channel identifier. The mapping is established on the initiating side when the channel is created and on the accepting side when it is accepted.
When events flow from P to Q, system P will look up the channel identifier for its tenant and annotate the communication flow with the channel identifier. System Q will then use that channel identifier to find the target tenant for routing the communication.
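The two lookups can be sketched with one private mapping table per system. The class name `ChannelMap`, its methods, and the identifiers are hypothetical, and the sketch assumes exactly one channel per tenant for simplicity; a real system would likely allow several.

```python
class ChannelMap:
    """One system's private mapping between local tenants and channel ids."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, str] = {}
        self._by_channel: dict[str, str] = {}

    def bind(self, local_tenant: str, channel_id: str) -> None:
        self._by_tenant[local_tenant] = channel_id
        self._by_channel[channel_id] = local_tenant

    def channel_for(self, local_tenant: str) -> str:
        return self._by_tenant[local_tenant]

    def tenant_for(self, channel_id: str) -> str:
        return self._by_channel[channel_id]


p_map, q_map = ChannelMap(), ChannelMap()
p_map.bind("erp-tenant-42", "ch-0001")  # bound when P creates the channel
q_map.bind("sub-7f3a", "ch-0001")       # bound when Q's tenant accepts it

# P -> Q: P resolves its tenant to the channel, Q resolves channel to tenant.
channel_id = p_map.channel_for("erp-tenant-42")
target = q_map.tenant_for(channel_id)
```

Neither table ever contains the other system's tenant identifier; the channel identifier is the only value both sides share.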
Annotating the communication flow means that the HTTP call or AMQP (etc.) message carries the channel identifier somewhere in the transport message. The channel identifier is only meaningful for the handover hop.
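For an HTTP handover, the annotation might look like the following sketch. The header name `x-tenant-channel-id` is purely hypothetical, since this document does not define where in the transport message the identifier goes. The sketch also uses a plain case-sensitive dictionary for headers, which is a simplification of real HTTP header handling.

```python
CHANNEL_HEADER = "x-tenant-channel-id"  # hypothetical header name


def annotate(headers: dict[str, str], channel_id: str) -> dict[str, str]:
    """Sending side: add the channel id for the handover hop."""
    return {**headers, CHANNEL_HEADER: channel_id}


def receive(headers: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Receiving side: extract the channel id and drop it before forwarding."""
    remaining = dict(headers)
    channel_id = remaining.pop(CHANNEL_HEADER)
    return channel_id, remaining


outbound = annotate({"content-type": "application/json"}, "ch-0001")
channel_id, inbound = receive(outbound)
```

Removing the header on receipt reflects that the identifier is only meaningful for the handover hop: once the receiving system has resolved the target tenant, the rest of the delivery path runs on its own tenant identifiers.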